Abstract

We characterize the search landscape of random instances of the job shop scheduling
problem (JSP). Specifically, we investigate how the expected values of (1) backbone size,
(2) distance between near-optimal schedules, and (3) makespan of random schedules vary
as a function of the job to machine ratio ($N/M$). For the limiting cases $N/M \to 0$ and
$N/M \to \infty$ we provide analytical results, while for intermediate values of $N/M$ we
perform experiments. We prove that as $N/M \to 0$, backbone size approaches 100%, while as
$N/M \to \infty$ the backbone vanishes. In the process we show that as $N/M \to 0$ (resp.
$N/M \to \infty$), simple priority rules almost surely generate an optimal schedule,
providing theoretical evidence of an "easy-hard-easy" pattern of typical-case instance
difficulty in job shop scheduling. We also draw connections between our theoretical
results and the "big valley" picture of JSP landscapes.

1. Introduction 
1.1 Motivations 

The goal of this work is to provide a picture of the typical landscape of a random instance
of the job shop scheduling problem (JSP), and to determine how this picture changes as
a function of the job to machine ratio ($N/M$). Such a picture is potentially useful in (1)
understanding how typical-case instance difficulty varies as a function of $N/M$ and (2)
designing or selecting search heuristics that take advantage of regularities in typical
instances of the JSP.


1.1.1 Understanding instance difficulty as a function of $N/M$

The job shop scheduling literature contains much empirical evidence that square JSPs (those
with $N/M = 1$) are more difficult to solve than rectangular instances (Fisher & Thompson,
1963). This work makes both theoretical and empirical contributions toward understanding
this phenomenon. Empirically, we show that both random schedules and random local
optima are furthest from optimality when $N/M \approx 1$. Analytically, we prove that in the
two limiting cases ($N/M \to 0$ and $N/M \to \infty$) there exist simple priority rules that
almost surely produce an optimal schedule, providing theoretical evidence of an
"easy-hard-easy" pattern of instance difficulty in the JSP.

1.1.2 Informing the design of search heuristics

Heuristics based on local search, for example tabu search (Glover & Laguna, 1997; Nowicki
& Smutnicki, 1996) and iterated local search (Lourenço, Martin, and Stützle, 2003), have
shown excellent performance on benchmark instances of the job shop scheduling problem
(Jain & Meeran, 1998; Jones & Rabelo, 1998). In order to design an effective heuristic, one
must (explicitly or implicitly) make assumptions about the search landscape of instances
to which the heuristic will be applied. For example, Nowicki and Smutnicki motivate the
use of path relinking in their state-of-the-art i-TSAB algorithm by citing evidence that the
JSP has a "big valley" distribution of local optima (Nowicki & Smutnicki, 2005). One of
the conclusions of our work is that the typical landscape of random instances can only be
thought of as a big valley for values of $N/M$ close to 1; for larger values of $N/M$
(including values common in benchmark instances), the landscape breaks into many big
valleys, suggesting that modifications to i-TSAB may allow it to better handle this case
(we discuss i-TSAB further in §9.3).

1.2 Contributions 

The contributions of this paper are twofold. First, we design a novel set of experiments and
run these experiments on random instances of the JSP. Second, we derive analytical results 
that confirm and provide insight into the trends suggested by our experiments. 
The main contributions of our empirical work are as follows. 

• For low values of $N/M$, we show that low-makespan schedules are clustered in a small
region of the search space and many attributes (i.e., directed disjunctive graph edges)
are common to all low-makespan schedules. As $N/M$ increases, low-makespan schedules
become dispersed throughout the search space and there are no attributes common
to all low-makespan schedules.

• We introduce a statistic (neighborhood exactness) that can be used to quantitatively
measure the "smoothness" of a search landscape, and estimate the expected value of
this statistic for random instances of the JSP. These results, in combination with the
results on clustering, suggest that the landscape of typical instances of the JSP can
be described as a big valley only for low values of $N/M$; for high values of $N/M$ there
are many separate big valleys.

For the limiting cases $N/M \to 0$ and $N/M \to \infty$, we derive analytical results.
Specifically, we prove that

• as $N/M \to 0$, the expected size of the backbone (i.e., the set of problem variables that
have a common value in all global optima) approaches 100%, while as $N/M \to \infty$, the
expected backbone size approaches 0%; and

• as $N/M \to 0$ (resp. $N/M \to \infty$), a randomly generated schedule will almost surely
(a) be located "close" in the search space to an optimal schedule and (b) have near-optimal
makespan.

2. Related Work

There are at least three threads of research that have conducted search space analyses 
related to the ones we conduct here. These include literature on the "big valley" distribution 
common to a number of combinatorial optimization problems, studies of backbone size in 
Boolean satisfiability, and a statistical mechanical analysis of the TSP. We briefly review 
these three areas below, as well as relevant work on phase transitions and the "easy-hard- 
easy" pattern of instance difficulty. 

2.1 The Big Valley 

The term "big valley" originated in a paper by Boese et al. (1994) that examined the 
distribution of local optima in the Traveling Salesman Problem (TSP). Based on a sample 
of local optima obtained by next-descent starting from random TSP tours, Boese calculated 
two correlations: 

1. the correlation between the cost of a locally optimal tour and its average distance to 
other locally optimal tours, and 

2. the correlation between the cost of a locally optimal tour and the distance from that 
tour to the best tour in the sample. 

The distance between two TSP tours was defined as the total number of edges minus 
the number of edges that are common to the two tours. Based on the fact that both of 
these correlations were surprisingly high, Boese conjectured that local optima in the TSP 
are arranged in a "big valley". Adapted from the work of Boese et al. (1994), Figure 1 
gives "an intuitive picture of the big valley, in which the set of local minima appears convex 
with one central global minimum" (Boese et al., 1994). We offer a more formal definition 
of a big valley landscape in §6. 
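
As a concrete reading of the tour distance used by Boese et al., the following minimal sketch counts the edges of one tour that do not appear in the other. The function names `tour_edges` and `tour_distance` are illustrative assumptions, not taken from the cited work, and tours are assumed to be lists of at least three distinct city labels.

```python
def tour_edges(tour):
    """Return the set of undirected edges of a closed tour (list of city labels)."""
    n = len(tour)
    return {frozenset((tour[i], tour[(i + 1) % n])) for i in range(n)}


def tour_distance(tour_a, tour_b):
    """Total number of edges minus the number of edges common to both tours."""
    edges_a, edges_b = tour_edges(tour_a), tour_edges(tour_b)
    return len(edges_a) - len(edges_a & edges_b)
```

Under this measure, two tours that traverse the same cycle (in either direction, from any starting city) are at distance 0.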

Boese's analysis has been applied to other combinatorial problems (Kim & Moon, 2004),
including the permutation flow shop scheduling problem (Watson, Barbulescu, Whitley, &
Howe, 2002; Reeves & Yamada, 1998) and the JSP (Nowicki & Smutnicki, 2001). Correlations
observed for the JSP are generally weaker than those observed for the TSP.

In a related study, Mattfeld (1996) examined cost-distance correlations in the famous
JSP instance ft10 (Beasley, 1990) and found evidence of a "Massif Central... where many
near optimal solutions reside laying closer together than other local optima." §4 contains
related results on the backbone size of ft10.

2.2 Backbone Size 

The backbone of a problem instance is the set of variables that are assigned a common value 
in all globally optimal solutions of that instance. For example, in the Boolean satisfiability 
problem (SAT), the backbone is the set of variables that are assigned a fixed truth value 
in all satisfying assignments. In the JSP, the backbone has been defined as the number of 
disjunctive edges (§3.2) that have a common orientation in all globally optimal schedules 
(a formal definition is given in §4). 

There is a large literature on backbones in combinatorial optimization problems, including
many empirical and analytical results (Slaney & Walsh, 2001; Monasson, Zecchina,
Kirkpatrick, Selman, & Troyansky, 1999). In an analysis of problem difficulty in the JSP,
Watson et al. (2001) present histograms of backbone size for random 6x6 (6 job, 6 machine)
and 6x4 (6 job, 4 machine) JSP instances. Summarizing experiments not reported in their
paper, Watson et al. note that "For [job:machine ratios] > 1.5, the bias toward small
backbones becomes more pronounced, while for ratios < 1, the bias toward larger backbones
is further magnified." §4 generalizes these observations and proves two theorems that give
insight into why this phenomenon occurs.

Figure 1: An intuitive picture of a "big valley" landscape.

2.3 Statistical Mechanical Analyses 

A large and growing literature applies techniques from statistical mechanics to the analysis 
of combinatorial optimization problems (Martin, Monasson, & Zecchina, 2001). At least
one result obtained in this literature concerns clustering of low-cost solutions. In a study of
the TSP, Mézard and Parisi (1986) obtain an expression for the expected overlap (number
of common edges) between random TSP tours drawn from a Boltzmann distribution. They 
show that as the temperature parameter of the Boltzmann distribution is lowered (placing 
more probability mass on low-cost TSP tours), expected overlap approaches 100%. Though 
we do not use a Boltzmann weighting, §5 of this paper examines how expected overlap 
between random JSP schedules changes as more probability mass is placed on low-makespan 
schedules. 

2.4 Phase Transitions and the Easy-hard-easy Pattern 

Loosely speaking, a phase transition occurs in a system when the expected value of some 
statistic varies discontinuously (asymptotically) as a function of some parameter. As an 
example, for any $\epsilon > 0$ it holds that random instances of the 2-SAT problem are
satisfiable with probability asymptotically approaching 1 when the clause to variable ratio
($m/n$) is $1 - \epsilon$, but are satisfiable with probability approaching 0 when the clause
to variable ratio is $1 + \epsilon$. A similar statement is conjectured to hold for 3-SAT; the
critical value $k$ of $m/n$ (if it exists) must satisfy $3.42 < k < 4.51$ (Achlioptas & Peres,
2004).

For some problems that exhibit phase transitions (notably 3-SAT), average-case instance
difficulty (for typical solvers) appears to first increase and then decrease as one increases
the relevant parameter, with the hardest instances appearing close to the threshold value
(Cheeseman, Kanefsky, & Taylor, 1991; Yokoo, 1997). This phenomenon has been referred
to as an "easy-hard-easy" pattern of instance difficulty (Mammen & Hogg, 1997). In §7.4
we discuss evidence of an easy-hard-easy pattern of instance difficulty in the JSP, though
(to our knowledge) it is not associated with any phase transition.

Figure 2: (A) A JSP instance, (B) a feasible schedule for the instance, and (C) the
disjunctive graph representation of the schedule. Boxes represent operations; operation
durations are proportional to the width of a box; and the machine on which an
operation is performed is represented by texture. In (C), solid arrows represent
conjunctive arcs and dashed arrows represent disjunctive arcs (arc weights are
proportional to the duration of the operation the arc points out of).

The results in §§4-5 and the empirical results in §6 were previously presented in a con- 
ference paper (Streeter & Smith, 2005a). 

3. The Job Shop Scheduling Problem
We adopt the notation [n] = {1,2, ... ,n}. 

3.1 Problem Definition 

Definition (JSP instance). An N by M JSP instance $I = \{J^1, J^2, \ldots, J^N\}$ is a set of N
jobs, where each job $J^k = (J^k_1, J^k_2, \ldots, J^k_M)$ is a sequence of M operations. Each
operation $o = J^k_i$ has an associated duration $\tau(o) \in (0, \tau_{max}]$ and machine
$m(o) \in [M]$. We require that each job uses each machine exactly once (i.e., for each
$J^k \in I$ and $\bar{m} \in [M]$, there is exactly one $i \in [M]$ such that $m(J^k_i) = \bar{m}$).
We define

1. $ops(I) = \{J^k_i : k \in [N], i \in [M]\}$,

2. $\tau(J^k) = \sum_{i=1}^{M} \tau(J^k_i)$,

3. the job-predecessor $\mathcal{J}(J^k_i)$ of an operation $J^k_i$ as

$$\mathcal{J}(J^k_i) = \begin{cases} J^k_{i-1} & \text{if } i > 1 \\ o^{\emptyset} & \text{otherwise} \end{cases}$$

where $o^{\emptyset}$ is a fictitious operation with $\tau(o^{\emptyset}) = 0$ and $m(o^{\emptyset})$ undefined.

Definition (JSP schedule). A JSP schedule for an instance I is a function
$S : ops(I) \to \Re_+$ that associates with each operation $o \in ops(I)$ a start time $S(o)$
(operation $o$ is performed on machine $m(o)$ from time $S(o)$ to time $S(o) + \tau(o)$;
preemption is not allowed). We make the following definitions.

1. The completion time of an operation $o$ is $S^+(o) = S(o) + \tau(o)$.

2. The machine-predecessor $\mathcal{M}(o)$ of an operation $o \in ops(I)$ is

$$\mathcal{M}(o) = \begin{cases} \arg\max_{\bar{o} \in O_{prev}(o)} S(\bar{o}) & \text{if } O_{prev}(o) \neq \emptyset \\ o^{\emptyset} & \text{otherwise} \end{cases}$$

where $O_{prev}(o) = \{\bar{o} \in ops(I) : m(\bar{o}) = m(o), S(\bar{o}) < S(o)\}$ is the set of
operations scheduled to run before $o$ on $o$'s machine.

3. S is a feasible schedule if $S(o) \ge \max(S^+(\mathcal{J}(o)), S^+(\mathcal{M}(o)))$ for all
$o \in ops(I)$.

4. The quantity

$$\ell(S) = \max_{o \in ops(I)} S^+(o)$$

is called the makespan of S.

We consider the makespan-minimization version of the JSP, in which the goal is to find
a schedule that minimizes the makespan.

For the remainder of the paper, whenever we refer to a JSP schedule S we shall adopt
the convention that $S(o^{\emptyset}) = 0$ and we shall assume that

$$S(o) = \max(S^+(\mathcal{J}(o)), S^+(\mathcal{M}(o))) \quad \forall o \in ops(I) \qquad (3.1)$$

(i.e., S is a so-called semi-active schedule, French, 1982). In other words, we ignore schedules
with superfluous idle time at the start of the schedule or between the end of one operation
and the start of another.

Figure 2 (A) and (B) depict, respectively, a JSP instance and a feasible schedule for
that instance.

3.2 Disjunctive Graphs

A schedule satisfying (3.1) can be uniquely represented by a weighted, directed graph called
its disjunctive graph. In the disjunctive graph representation of a schedule S for a JSP
instance I, each operation $o \in ops(I)$ is a vertex and a directed edge $(o_1, o_2)$ indicates that
operation $o_1$ completes before $o_2$ starts.
Definition (disjunctive graph). The disjunctive graph $G = \mathcal{G}(I, S)$ of a schedule S for
a JSP instance I is the weighted, directed graph $G = (V, E, w)$ defined as follows.

• $V = ops(I) \cup \{o^{\emptyset}, o^*\}$, where $o^*$ (like $o^{\emptyset}$) is a fictitious operation with
$\tau(o^*) = 0$ and $m(o^*)$ undefined.

• $E = C \cup D$, where

  - $C = \{(\mathcal{J}(o), o) : o \in ops(I)\} \cup \{(J^k_M, o^*) : k \in [N]\}$ is called the set of
    conjunctive arcs (which specify that $o$ cannot start until $\mathcal{J}(o)$ completes), and

  - $D = \{(o_1, o_2) : \{o_1, o_2\} \subseteq ops(I), m(o_1) = m(o_2), S(o_1) < S(o_2)\}$ is called the
    set of disjunctive arcs (which specify, for each pair of operations performed on the
    same machine, which of the two operations is to be performed first).

• $w((o_1, o_2)) = \tau(o_1)$.

Figure 2 (C) depicts the disjunctive graph for the schedule depicted in Figure 2 (B). 
The connection between a schedule and its disjunctive graph is established by the following 
proposition (Roy & Sussmann, 1964). 

Proposition 1. Let S be a feasible schedule for I satisfying (3.1), and let $G = \mathcal{G}(I, S)$ be
the corresponding disjunctive graph. Then $\ell(S)$ is equal to the length of the longest weighted
path from $o^{\emptyset}$ to $o^*$ in G.

Proof. For any operation $o$, let $L(o)$ denote the length of the longest weighted path from
$o^{\emptyset}$ to $o$ in G. It suffices to show that for any $o \in ops(I)$, $S(o) = L(o)$. This follows by
induction on the number of edges in the path, with the base case $S(o^{\emptyset}) = L(o^{\emptyset}) = 0$. □
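
Proposition 1 can be checked numerically on a small example. The sketch below is an illustration under stated assumptions rather than the authors' code: an instance is represented by two hypothetical nested lists, `machines[k][i]` and `durations[k][i]` (0-indexed job k, operation i), and a schedule by a dict `start` mapping `(k, i)` to a start time. It computes the longest weighted path from the source to the sink by dynamic programming, using the fact that start-time order is a topological order of the disjunctive graph when durations are positive.

```python
def longest_path_makespan(machines, durations, start):
    """Length of the longest source-to-sink path in the disjunctive graph of `start`."""
    order = sorted(start, key=start.get)      # start-time order is a topological order
    dist = {}                                 # dist[o] = longest weighted path from source to o
    for (k, i) in order:
        m = machines[k][i]
        # Conjunctive arc from the job-predecessor (the source arc has weight 0).
        preds = [(k, i - 1)] if i > 0 else []
        # Disjunctive arcs from every operation already placed on the same machine.
        preds += [o for o in dist if machines[o[0]][o[1]] == m]
        # Each arc (p, o) has weight equal to the duration of p.
        dist[(k, i)] = max((dist[p] + durations[p[0]][p[1]] for p in preds), default=0)
    last = len(machines[0]) - 1
    # Arcs into the sink leave the last operation of every job.
    return max(dist[(k, last)] + durations[k][last] for k in range(len(machines)))
```

For a semi-active schedule, the returned value should coincide with the largest completion time over all operations, which is what the proposition asserts.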

The undirected version of a disjunctive arc is called a disjunctive edge. 

Definition (disjunctive edge). Let I be a JSP instance. A disjunctive edge is a set
$\{o_1, o_2\} \subseteq ops(I)$ with $m(o_1) = m(o_2)$. We define the following notation.

• $E(I)$ is the set of disjunctive edges for I.

• Let S be a schedule for I and let $e = \{o_1, o_2\}$ be a disjunctive edge. We denote by
$e(S)$ the unique arc in $\{(o_1, o_2), (o_2, o_1)\}$ that appears in the disjunctive graph
$\mathcal{G}(I, S)$ (this arc is called the orientation of e in S).

We measure the distance between two schedules $S_1$ and $S_2$ for a JSP instance I by
counting the number of disjunctive edges that are oriented in opposite directions in
$\mathcal{G}(I, S_1)$ and $\mathcal{G}(I, S_2)$.

Definition (disjunctive graph distance). The disjunctive graph distance $\|S_1 - S_2\|$
between two schedules $S_1$ and $S_2$ for a JSP instance I is defined by

$$\|S_1 - S_2\| = |\{e \in E(I) : e(S_1) \neq e(S_2)\}| .$$
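
The distance can be computed directly from start times. The following sketch assumes a hypothetical representation in which `machines[k][i]` gives the machine of operation i of job k (0-indexed) and each schedule is a dict from `(k, i)` to start time; the function name is illustrative.

```python
from itertools import combinations


def disjunctive_distance(machines, start_a, start_b):
    """Count disjunctive edges whose orientation differs between two schedules."""
    ops_on_machine = {}
    for k, row in enumerate(machines):
        for i, m in enumerate(row):
            ops_on_machine.setdefault(m, []).append((k, i))
    distance = 0
    for ops in ops_on_machine.values():
        # Every pair of operations on the same machine is one disjunctive edge.
        for o1, o2 in combinations(ops, 2):
            if (start_a[o1] < start_a[o2]) != (start_b[o1] < start_b[o2]):
                distance += 1
    return distance
```

The distance is at most $M\binom{N}{2}$, the total number of disjunctive edges.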
3.3 Random Schedules and Instances

We define a uniform distribution over JSP instances as follows. Our distribution is identical 
to the one used by Taillard (1993). 

Definition (random JSP instance). A random N by M JSP instance I is generated as
follows.

1. Let $\phi_1, \phi_2, \ldots, \phi_N$ be random permutations of [M].

2. Let G be a probability distribution over $(0, \tau_{max}]$ with mean $\mu$ and variance
$\sigma^2 > 0$.

3. Define $I = \{J^1, J^2, \ldots, J^N\}$, where $m(J^k_i) = \phi_k(i)$ and each $\tau(J^k_i)$ is drawn
(independently at random) from G.

Note that this definition (and likewise, our theoretical results) assumes a maximum
operation duration $\tau_{max}$, but makes no assumptions about the form of the distribution of
operation durations. For the empirical results reported in this paper, we choose operation
durations from a uniform distribution over {1, 2, ..., 100}.
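
The generation procedure can be sketched as follows. This is a minimal illustration, assuming 0-indexed jobs and machines and the uniform duration distribution over {1, ..., 100} used in the experiments; the function name and the `(machines, durations)` representation are hypothetical choices made for this example.

```python
import random


def random_jsp_instance(n_jobs, n_machines, t_max=100, rng=None):
    """Return (machines, durations) for a random N by M JSP instance.

    machines[k][i]  is the machine used by operation i of job k;
    durations[k][i] is that operation's duration, uniform on {1, ..., t_max}.
    """
    rng = rng or random.Random()
    machines, durations = [], []
    for _ in range(n_jobs):
        perm = list(range(n_machines))
        rng.shuffle(perm)                 # phi_k: a uniformly random machine order
        machines.append(perm)
        durations.append([rng.randint(1, t_max) for _ in range(n_machines)])
    return machines, durations
```

Passing a seeded `random.Random` makes the generated instance reproducible.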

Our proofs will frequently make use of priority rules. A priority rule is a greedy schedule- 
building algorithm that assigns a priority to each operation and, at each step of the greedy 
algorithm, assigns the earliest possible start time to the operation with minimum priority. 

Definition (priority rule). A priority rule $\pi$ is a function that, given an instance I and
an operation $o \in ops(I)$, returns a priority $\pi(I, o) \in \Re$. The schedule $S = S(\pi, I)$
associated with $\pi$ is defined by the following procedure.

1. $Unscheduled \leftarrow ops(I)$, $S(o^{\emptyset}) \leftarrow 0$.

2. While $|Unscheduled| > 0$ do:

(a) $Ready \leftarrow \{o \in Unscheduled : \mathcal{J}(o) \notin Unscheduled\}$.

(b) $\bar{o} \leftarrow$ the element of Ready with least priority.

(c) $S(\bar{o}) \leftarrow \max(S^+(\mathcal{J}(\bar{o})), S^+(\mathcal{M}(\bar{o})))$.

(d) Remove $\bar{o}$ from Unscheduled.

A priority rule is called instance-independent if, for any N by M JSP instance I and
integers $k \in [N]$, $i \in [M]$, the value $\pi(I, J^k_i)$ depends only on k, i, N, and M.
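
The procedure above can be sketched in a few lines. This is an illustrative implementation under assumptions: the instance is given as hypothetical nested lists `machines[k][i]` and `durations[k][i]` (0-indexed), and `priority` is any callable taking `(k, i)`. The Ready set is represented implicitly, since it always consists of the first unscheduled operation of each unfinished job.

```python
def build_schedule(machines, durations, priority):
    """Greedy schedule builder: the ready operation with least priority goes next."""
    n_jobs, n_ops = len(machines), len(machines[0])
    next_op = [0] * n_jobs        # index of each job's first unscheduled operation
    job_free = [0] * n_jobs       # completion time of each job's last scheduled operation
    mach_free = {}                # completion time of the last operation on each machine
    start = {}
    for _ in range(n_jobs * n_ops):
        ready = [(k, next_op[k]) for k in range(n_jobs) if next_op[k] < n_ops]
        k, i = min(ready, key=lambda op: priority(*op))
        m = machines[k][i]
        # Line 2(c): start as soon as both job- and machine-predecessor are done.
        start[(k, i)] = max(job_free[k], mach_free.get(m, 0))
        job_free[k] = mach_free[m] = start[(k, i)] + durations[k][i]
        next_op[k] += 1
    return start
```

Because each operation is appended at its earliest possible start time, the resulting schedule satisfies (3.1) by construction.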

We obtain a random schedule by assigning random priorities to each operation. The 
resulting distribution is equivalent to the one used by Mattfeld (1996). 

Definition (random schedule). A random schedule for an N by M JSP instance I is
generated by performing the following steps.

1. Create a list L containing M occurrences of the integer k for each $k \in [N]$ (we think
of the M occurrences of k as representing the operations in the job $J^k$).

2. Shuffle L (obtaining each permutation with equal probability).

3. Return the schedule $S(\pi_{rand}, I)$ where $\pi_{rand}(I, J^k_i)$ = the index of the $i$-th
occurrence of k in L.
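
A minimal sketch of the random priority assignment follows. It assumes 0-indexed jobs and operations and returns the priorities as a dict; the function name is hypothetical, and turning the priorities into a schedule would require a separate greedy builder implementing the priority-rule procedure of this section.

```python
import random


def random_priorities(n_jobs, n_ops, rng=None):
    """Return {(k, i): index of the i-th occurrence of k in a shuffled list L}."""
    rng = rng or random.Random()
    L = [k for k in range(n_jobs) for _ in range(n_ops)]   # M copies of each job index
    rng.shuffle(L)
    seen = [0] * n_jobs           # occurrences of each job index encountered so far
    priorities = {}
    for index, k in enumerate(L):
        priorities[(k, seen[k])] = index
        seen[k] += 1
    return priorities
```

By construction, the operations of a job receive increasing priorities, so the order in which the greedy procedure selects operations is exactly the order of L.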
4. Number of Common Attributes as a Function of Makespan

The backbone of a JSP instance is the set of disjunctive edges that have a common
orientation in all schedules whose makespan is globally optimal. For $\rho \ge 1$, we define the
$\rho$-backbone to be the set of disjunctive edges that have a common orientation in all
schedules whose makespan is within a factor $\rho$ of optimal (a related definition appears in
Slaney & Walsh, 2001).

Definition ($\rho$-backbone). Let I be a JSP instance with optimal makespan $\ell_{min}(I)$. For
$\rho \ge 1$, let $\rho\_opt(I) = \{S : \ell(S) \le \rho \cdot \ell_{min}(I)\}$ be the set of schedules whose
makespan is within a factor $\rho$ of optimal. Then

$$\rho\_backbone(I) = \{e \in E(I) : e(S_1) = e(S_2) \;\; \forall \{S_1, S_2\} \subseteq \rho\_opt(I)\} .$$

In this section we compute the expected value of $|\rho\_backbone|$ as a function of $\rho$ for
random N by M JSP instances, and examine how the shape of this curve changes as a
function of $N/M$.
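
For very small instances the definition can be evaluated by brute force. The sketch below is an illustration only, not the branch and bound method used in this paper: it enumerates semi-active schedules by replaying every distinct ordering of job indices (each ordering appends operations at their earliest start time), then keeps the disjunctive edges whose orientation agrees across all schedules within a factor `rho` of the best makespan found. The instance representation (`machines[k][i]`, `durations[k][i]`) and all function names are assumptions made for this example, and the enumeration is only practical for a handful of operations.

```python
from itertools import combinations, permutations


def schedule_from_sequence(machines, durations, seq):
    """Start times obtained by appending operations job by job in the order seq."""
    nxt, job_free, mach_free, start = {}, {}, {}, {}
    for k in seq:
        i = nxt.get(k, 0)
        m = machines[k][i]
        start[(k, i)] = max(job_free.get(k, 0), mach_free.get(m, 0))
        job_free[k] = mach_free[m] = start[(k, i)] + durations[k][i]
        nxt[k] = i + 1
    return start


def rho_backbone(machines, durations, rho):
    """Disjunctive edges with a common orientation in all schedules within rho of optimal."""
    n_jobs, n_ops = len(machines), len(machines[0])
    base = [k for k in range(n_jobs) for _ in range(n_ops)]
    schedules = [schedule_from_sequence(machines, durations, s)
                 for s in set(permutations(base))]

    def makespan(s):
        return max(s[(k, i)] + durations[k][i] for (k, i) in s)

    best = min(makespan(s) for s in schedules)
    near_opt = [s for s in schedules if makespan(s) <= rho * best]
    edges = [(o1, o2) for o1, o2 in combinations(sorted(schedules[0]), 2)
             if machines[o1[0]][o1[1]] == machines[o2[0]][o2[1]]]
    # An edge is kept when every near-optimal schedule orders its endpoints the same way.
    return [e for e in edges if len({s[e[0]] < s[e[1]] for s in near_opt}) == 1]
```

Dividing the length of the returned list by $M\binom{N}{2}$ gives the normalized backbone size for the chosen value of `rho`.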

4.1 Computing the $\rho$-backbone

To compute the $\rho$-backbone we use the following proposition.

Proposition 2. Let I be a JSP instance with optimal makespan $\ell_{min}(I)$. Let $e = \{o_1, o_2\}$
be a disjunctive edge with orientations $a_1 = (o_1, o_2)$ and $a_2 = (o_2, o_1)$. For any disjunctive
arc a, let $\ell_{min}(I|a)$ denote the optimum makespan among schedules whose disjunctive graph
contains the arc a. Then

$$e \in \rho\_backbone(I) \iff \max\{\ell_{min}(I|a_1), \ell_{min}(I|a_2)\} > \rho \cdot \ell_{min}(I) .$$

Proof. If $e \in \rho\_backbone$, then e must have a common orientation (say $a_1$) in all schedules
S with $\ell(S) \le \rho \cdot \ell_{min}(I)$, which implies $\ell_{min}(I|a_2) > \rho \cdot \ell_{min}(I)$. If
$e \notin \rho\_backbone$, then there must be some $\{S_1, S_2\} \subseteq \rho\_opt(I)$ with $e(S_1) = a_1$ and
$e(S_2) = a_2$, which implies $\max\{\ell_{min}(I|a_1), \ell_{min}(I|a_2)\} \le \rho \cdot \ell_{min}(I)$. □

Thus to compute $\rho\_backbone(I)$ we need only to compute $\ell_{min}(I|a)$ for the
$2M\binom{N}{2}$ possible choices of a. Given a disjunctive arc a, we compute $\ell_{min}(I|a)$
using branch and bound. In branch and bound algorithms for the JSP, nodes in the search
tree represent choices of orientations for a subset of the disjunctive edges. By constructing
a root search tree node that has a as a fixed arc, we can determine $\ell_{min}(I|a)$. We use a
branch and bound algorithm due to Brucker et al. (1994) because it is efficient and because
the code for it is freely available via ORSEP (Brucker, Jurisch, & Sievers, 1992).

Computing $\ell_{min}(I|a)$ for the $2M\binom{N}{2}$ possible choices of a requires only
$1 + M\binom{N}{2}$ runs of branch and bound. The first run is used to find a globally optimal
schedule, which gives the value of $\ell_{min}(I|a)$ for $M\binom{N}{2}$ possible choices of a
(namely, the $M\binom{N}{2}$ disjunctive arcs that are present in the globally optimal schedule).
A separate run is used for each of the $M\binom{N}{2}$ remaining choices of a.
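
To give a sense of scale, the counts above can be evaluated for a concrete instance size. The helper below is a small illustrative calculation (its name is hypothetical); it uses the fact that there is one disjunctive edge per pair of jobs per machine.

```python
from math import comb


def arc_and_run_counts(n_jobs, n_machines):
    """Return (number of candidate disjunctive arcs, number of branch and bound runs)."""
    edges = n_machines * comb(n_jobs, 2)   # one disjunctive edge per job pair per machine
    return 2 * edges, 1 + edges
```

For a 10 job, 10 machine instance this gives 900 candidate arcs and 451 runs.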

Figure 3 graphs the fraction of disjunctive edges that belong to the $\rho$-backbone as a
function of $\rho$ for instance ft10 (a 10 job, 10 machine instance) from the OR library (Beasley,
1990). Note that by definition the curve is non-increasing with respect to $\rho$, and that the
curve is exact for all $\rho$. It is noteworthy that among schedules whose makespan is within a
factor 1.005 of optimal, 80% of the disjunctive edges have a fixed orientation. We will see
that this behavior is typical of JSP instances with $N/M = 1$.

Figure 3: Normalized $|\rho\_backbone|$ as a function of $\rho$ for OR library instance ft10.

4.2 Results 

We plotted $|\rho\_backbone|$ as a function of $\rho$ for all instances in the OR library having 10
or fewer jobs and 10 or fewer machines. The results are available online (Streeter & Smith,
2005b). Inspection of the graphs revealed that the shape of the curve is largely a function
of the job:machine ratio. To investigate this further, we repeat these experiments on a large
number of randomly generated JSP instances.

We use randomly generated instances with 7 different combinations of N and M to study
instances with $N/M$ equal to 1, 2, or 3. For $N/M = 1$ we use 6x6, 7x7, and 8x8 instances; for
$N/M = 2$ we use 8x4 and 10x5 instances; and for $N/M = 3$ we use 9x3 and 12x4 instances. We
generate 1000 random instances for each combination of N and M.

Figure 4 parts (A), (B), and (C) graph the expected fraction of edges belonging to the
$\rho$-backbone as a function of $\rho$ for each combination of N and M, grouped according to
$N/M$. Figure 4 (D) compares the curves for different values of $N/M$, and plots the 0.25 and
0.75 quantiles. For the purposes of this study the two most important observations about
Figure 4 are as follows.

• The curves depend on both the size of the instance (i.e., NM) and the shape (i.e.,
$N/M$). Of these two factors, $N/M$ has by far the stronger influence on the shape of the
curves.

• For all values of $\rho$, the expected fraction of edges belonging to the $\rho$-backbone
decreases as $N/M$ increases.
Figure 4: Expected fraction of edges in $\rho$-backbone as a function of $\rho$ for random JSP
instances. Graphs (A), (B), and (C) depict curves for random instances with $N/M$
= 1, 2, and 3, respectively. Graph (D) compares the curves depicted in (A), (B),
and (C) (only the curves for the largest instance sizes are shown in (D)). In (D),
top and bottom error bars represent 0.75 and 0.25 quantiles, respectively.
4.3 Analysis

We now give some insight into Figure 4 by analyzing two limiting cases. We prove that as
$N/M \to 0$, the expected fraction of disjunctive edges that belong to the backbone approaches
1, while as $N/M \to \infty$ this expected fraction approaches 0.

Intuitively, what happens is as follows. As $N/M \to 0$ (i.e., N is held constant and
$M \to \infty$) each of the jobs becomes very long. Individual disjunctive edges then represent
precedence relations among operations that should be performed very far apart in time. For
example, if there are 10,000 machines (and so each job consists of 10,000 operations), a
disjunctive edge might specify whether operation 1,200 of job A is to be performed before
operation 8,500 of job B. Clearly, waiting for job B to complete 8,500 of its operations before
allowing job A to complete 12% of its operations is likely to produce an inefficient schedule.
Thus, orienting a single disjunctive edge in the "wrong" direction is likely to prevent a
schedule from being optimal, and so any particular edge will likely have a common
orientation in all globally optimal schedules.

In contrast, when $N/M \to \infty$, it is the workloads of the machines that become very long.
The order in which the jobs are processed on a particular machine does not matter much as
long as the machine with the longest workload is kept busy, and so the fact that a particular
edge is oriented a particular way is unlikely to prevent a schedule from being optimal. All
of this is formalized below.

We will make use of the following well-known definition. 

Definition (whp). A sequence of events $\xi_n$ occurs with high probability (whp) if
$\lim_{n \to \infty} P[\xi_n] = 1$.

Lemma 1 and Theorem 1 show that for constant N, a randomly chosen edge of a random
N by M JSP instance will be in the backbone whp (as $M \to \infty$). Lemma 2 and Theorem 2
show that for constant M, a randomly chosen edge of a random N by M JSP instance will
not be in the backbone whp (as $N \to \infty$).

Lemma 1. Let I be a random N by M JSP instance, and let $S = S(\pi, I)$ be the schedule
for I obtained using some instance-independent priority rule $\pi$. For an arbitrary job
$J \in I$, define $\Delta^S_J = S^+(J_M) - \tau(J)$. Then $E[\Delta^S_J]$ is $O(N)$.

Proof. We assume N = 2 and M > 1. The generalization to larger N is straightforward,
while the cases N = 1 and M = 1 are trivial. Let $I = \{J^1, J^2\}$ and let $J = J^1$.

Let $T = (o_1, o_2, \ldots, o_{NM})$ be the sequence of operations selected from Ready (in line
2(b) of the definition of a priority rule in §3.3) in constructing S. We say that an operation
$J^1_i$ overlaps with an operation $J^2_j$ if

1. $J^2_j$ appears before $J^1_i$ in T, and

2. $[S(J^2_j), S^+(J^2_j)] \cap [S^+(J^1_{i-1}), S^+(J^1_{i-1}) + \tau(J^1_i)] \neq \emptyset$.

If additionally $m(J^1_i) = m(J^2_j)$, we say that $J^1_i$ contends with $J^2_j$. Intuitively, if
$o = J^1_i$ overlaps with $o' = J^2_j$ then the start time of $o$ might have been delayed because
$o$'s machine was being used by $o'$. If $o$ contends with $o'$, then the start time of $o$ actually
was delayed.

Let $\theta_{i,j}$ (resp. $\delta_{i,j}$) be an indicator for the event that $J^1_i$ overlaps (resp.
contends) with $J^2_j$. Let $C_i = \{J^2_j : \theta_{i,j} = 1\}$ be the set of operations in $J^2$ that
$J^1_i$ overlaps with. Then $|C_i \cap \bigcup_{i' > i} C_{i'}| \le 1$. Thus

$$\sum_i |C_i| = \sum_i \Bigl| C_i \setminus \bigcup_{i' > i} C_{i'} \Bigr| + \sum_i \Bigl| C_i \cap \bigcup_{i' > i} C_{i'} \Bigr| \le 2M . \qquad (4.1)$$

Let $\bar{I} = I_{N,M-1}$ be a random N by M - 1 JSP instance, and define $\bar{\theta}_{i,j}$,
$\bar{\delta}_{i,j}$, and $\bar{C}_i$ analogously to the above. Then for $i, j \le M - 1$,

$$P[\theta_{i,j} = 1 \mid m(J^1_i) = m(J^2_j)] = P[\bar{\theta}_{i,j} = 1] .$$

This is true because $P[\theta_{i,j} = 1]$ is a function of the joint distribution of the operations in
the set $\{J^1_{i'} : i' < i\} \cup \{J^2_{j'} : j' < j\}$; and, as far as this joint distribution is
concerned, conditioning on the event $m(J^1_i) = m(J^2_j)$ is like deleting the operations that use
the machine $m(J^1_i)$.

Thus $E[\delta_{i,j}] = P[\delta_{i,j} = 1] = \frac{1}{M} P[\theta_{i,j} = 1 \mid m(J^1_i) = m(J^2_j)] = \frac{1}{M} P[\bar{\theta}_{i,j} = 1] = \frac{1}{M} E[\bar{\theta}_{i,j}]$.
Therefore,

$$E\Bigl[\sum_{i=1}^{M} \sum_{j=1}^{M} \delta_{i,j}\Bigr] \le 2 + \frac{1}{M} E\Bigl[\sum_{i=1}^{M-1} \sum_{j=1}^{M-1} \bar{\theta}_{i,j}\Bigr] \le 4$$

where in the last step we have used (4.1). It follows that $E[\Delta^S_J] \le 4\tau_{max}$ ($\tau_{max}$
is the maximum operation duration defined in §3). When we consider arbitrary N, we get
$E[\Delta^S_J] \le 4\tau_{max}(N - 1)$. □



As a corollary of Lemma 1, we can show that a simple priority rule ($\pi_0$) almost surely
generates an optimal schedule in the case $N/M \to 0$.



Definition (priority rule $\pi_0$). Given an N by M JSP instance I, let
$k^* = \arg\max_{k \in [N]} \tau(J^k)$ be the index of the longest job. The priority rule $\pi_0$ first
schedules the operations in $J^{k^*}$, then schedules the remaining operations in a fixed order.

$$\pi_0(I, J^k_i) = \begin{cases} i & \text{if } k = k^* \\ Mk + i & \text{otherwise.} \end{cases}$$
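
As an illustration of this rule, the following sketch builds the priority function for a given instance. It assumes durations are stored in a hypothetical nested list `durations[k][i]` with 0-indexed jobs and operations, and converts to the 1-based indices of the definition inside the function; ties for the longest job are broken in favor of the smallest index, which is an arbitrary choice of this example.

```python
def pi_0(durations):
    """Return a priority function implementing the rule: longest job first."""
    n_ops = len(durations[0])
    k_star = max(range(len(durations)), key=lambda k: sum(durations[k]))

    def priority(k, i):
        # 1-based k and i as in the definition: i if k = k*, else M*k + i.
        if k == k_star:
            return i + 1
        return n_ops * (k + 1) + (i + 1)

    return priority
```

Every operation of the longest job receives a priority of at most M, while every other operation receives a priority of at least M + 1, so the longest job is never delayed by another job.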



Corollary 1. Let I be a random N by M JSP instance. Then for fixed N, it holds
whp (as $M \to \infty$) that the schedule $S = S(\pi_0, I)$ is optimal and has makespan
$\ell(S) = \max_{k \in [N]} \tau(J^k)$.

Proof. Define the priority rule $\pi_{\bar{k}}$ by $\pi_{\bar{k}}(I, J^k_i) = i$ if $k = \bar{k}$, $Mk + i$
otherwise. Then $\pi_{\bar{k}}$ is instance-independent, and $\pi_0$ is equivalent to $\pi_{k^*}$. Thus
for any $J \in I$ we have

$$E[\Delta_J] \le \sum_{\bar{k}=1}^{N} E\bigl[\Delta_J^{S(\pi_{\bar{k}}, I)}\bigr] = O(N^2)$$

where we define $\Delta_J = \Delta_J^{S(\pi_0, I)}$, and the second step uses Lemma 1. By Markov's
inequality, $\Delta_J < M^{1/4}$ $\forall J \in I$ whp. By the Central Limit Theorem, each $\tau(J)$ is
asymptotically normally distributed with mean $\mu M$ and standard deviation $\sigma\sqrt{M}$. It
follows that whp, $\tau(J^{k^*}) - \tau(J^k) > M^{1/3}$ $\forall k \neq k^*$. This implies
$\ell(S) = \tau(J^{k^*})$. Because $\tau(J^{k^*})$ is a lower bound on the makespan of any schedule, the
corollary follows. □

Theorem 1. Let I be a random N by M JSP instance, and let e be a randomly selected
element of $E(I)$. Then for fixed N, it holds whp (as $M \to \infty$) that $e \in 1\_backbone(I)$.

Proof. Let $e = \{J_i, J'_j\}$ with $i < j$ and let $a = (J'_j, J_i)$. By Proposition 1 and
Corollary 1, it suffices to show that whp, all disjunctive graphs containing a contain a path
from $o^{\emptyset}$ to $o^*$ with weighted length $> \max_{k \in [N]} \tau(J^k)$.

Assume $j - i > M^{3/4}$ (this holds whp because both i and j are selected uniformly at
random from [M]), and consider the path

$$P = (o^{\emptyset}, J'_1, J'_2, \ldots, J'_j, J_i, J_{i+1}, \ldots, J_M, o^*)$$

which passes through $|P| > 3 + M + M^{3/4}$ vertices and has weighted length $w(P)$. We want
to show that $w(P) > \max_{J \in I} \tau(J)$ whp. By the Central Limit Theorem, (1) for any fixed
i and j, $w(P)$ is asymptotically normally distributed with mean $\mu(|P| - 2)$ and standard
deviation $\sigma\sqrt{|P| - 2}$ and (2) for each J, $\tau(J)$ is asymptotically normally distributed
with mean $\mu M$ and standard deviation $\sigma\sqrt{M}$. That $w(P) > \max_{J \in I} \tau(J)$ whp follows
by Chebyshev's inequality. □
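
The vertex count used in the proof follows from counting the four segments of the path P. A short worked step, assuming the path runs through the first j operations of one job and then operations i through M of the other:

```latex
\begin{align*}
|P| &= \underbrace{1}_{o^{\emptyset}}
     + \underbrace{j}_{J'_1, \ldots, J'_j}
     + \underbrace{(M - i + 1)}_{J_i, \ldots, J_M}
     + \underbrace{1}_{o^*} \\
    &= 3 + M + (j - i).
\end{align*}
```

Any lower bound on $j - i$ therefore transfers directly to $|P|$, and the path contains $|P| - 2$ real operations, which is why the mean of $w(P)$ is $\mu(|P| - 2)$.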

Lemma 2 shows that as $N/M \to \infty$, a simple priority rule ($\pi_\infty$) almost surely generates
a schedule in which no machine is idle until all the operations performed on that machine
have been completed (a schedule with this property is clearly optimal).

Definition (priority rule $\pi_\infty$). Given an N by M JSP instance I, the priority rule
$\pi_\infty$ first schedules the first operation of each job (taking the jobs in order of ascending
indices), then the second operation of each job, and so forth. It is defined by
$\pi_\infty(I, J^k_i) = iN + k$.
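
The round-robin order induced by this rule is easy to see in code. The sketch below is illustrative only; it assumes 0-indexed jobs and operations and converts to the 1-based indices of the definition, and the function name is hypothetical.

```python
def pi_inf(n_jobs):
    """Return a priority function for the rule pi(I, J^k_i) = i*N + k (1-based k, i)."""
    def priority(k, i):
        return (i + 1) * n_jobs + (k + 1)

    return priority
```

Sorting all operations by this priority lists the first operation of every job in order of job index, then the second operation of every job, and so on.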

Lemma 2. Let I be a random N by M JSP instance. Then for fixed M, it holds whp (as
$N \to \infty$) that the schedule $S = S(\pi_\infty, I)$ has the property that

$$S(o) = S^+(\mathcal{M}(o)) \quad \forall o \in ops(I) .$$

Proof. Suppose that when executing $\pi_\infty$ we replace the line $S(o) \leftarrow \max(S^{+}(\mathcal{J}(o)),
S^{+}(\mathcal{M}(o)))$ (line 2(c) in the definition of a priority rule given in §3.3) with $S(o) \leftarrow S^{+}(\mathcal{M}(o))$.
If the resulting $S$ is feasible then the replacement must have had no effect. Thus it suffices
to show that the resulting $S$ is feasible whp. Equivalently, we want to show that whp,
$S(o) \ge S^{+}(\mathcal{J}(o))$ $\forall o \in ops(I)$ when $S$ is constructed using the modified version of line
2(c).

Let $ops^{2+}(I) = \{J^k_i \in ops(I) : i > 1\}$ be the set of operations that are not first in their
job. It suffices to show that $S(o) - S(\mathcal{J}(o)) \ge \tau_{max}$ $\forall o \in ops^{2+}(I)$. To this end, consider
an arbitrary operation $o = J^k_i \in ops^{2+}(I)$. Under $\pi_\infty$, the number of operations with lower
priority than $o$ is $(i - 1)N + (k - 1)$. The number of operations that have lower priority
than $J^k_i$ and run on machine $m(o)$ is, in expectation, equal to $\frac{1}{M}[(i - 1)(N - 1) + (k - 1)]$
(where the switch from $N$ to $N - 1$ is due to the fact that $o$ is the only operation in job $J^k$
that uses machine $m(o)$). It follows that

$E[S(o)] = \frac{\mu}{M}[(i - 1)(N - 1) + (k - 1)]$

so that

$E[S(o) - S(\mathcal{J}(o))] = E[S(J^k_i) - S(J^k_{i-1})] = \frac{\mu (N - 1)}{M}$.
In Appendix A we use a martingale tail inequality to establish the following claim.

Claim 2.1. With high probability, for all $o \in ops^{2+}(I)$ we have

$S(o) - S(\mathcal{J}(o)) \ge \frac{1}{2} E[S(o) - S(\mathcal{J}(o))]$.

The Lemma then follows from the fact that $\frac{1}{2} E[S(o) - S(\mathcal{J}(o))] \ge \tau_{max}$ for $N$ sufficiently
large. □

Based on the results of computational experiments, Taillard (1994) conjectured that as
$N/M \to \infty$ the optimal makespan is almost surely equal to the maximum machine workload.
The following corollary of Lemma 2 confirms this conjecture.

Corollary 2. Let $I$ be a random N by M JSP instance with optimal makespan $\ell_{min}(I)$.
Let $\tau(\bar{m}) = \tau(\{o \in ops(I) : m(o) = \bar{m}\})$ denote the workload of machine $\bar{m}$. Then for fixed
$M$, it holds whp (as $N \to \infty$) that $\ell_{min}(I) = \max_{\bar{m} \in [M]} \tau(\bar{m})$.

Theorem 2. Let $I$ be a random N by M JSP instance, and let $e$ be a randomly selected
element of $E(I)$. Then for fixed $M$, it holds whp (as $N \to \infty$) that $e \notin 1\_backbone(I)$.

Proof. Let $e = \{J_i, J'_j\}$. Remove both $J$ and $J'$ from $I$ to create an $N - 2$ by $M$ instance
$\bar{I}$, which comes from the same distribution as a random $N - 2$ by $M$ JSP instance. Lemma
2 shows that whp there exists an optimal schedule $\bar{S}$ for $\bar{I}$ with the property described in
the statement of the lemma.

Let $\tau(\bar{m}) = \tau(\{o \in ops(\bar{I}) : m(o) = \bar{m}\})$ denote the workload of machine $\bar{m}$ in the
instance $\bar{I}$. By the Central Limit Theorem, each $\tau(\bar{m})$ is asymptotically normally distributed
with mean $\mu(N - 2)$ and standard deviation $\sigma\sqrt{N - 2}$. It follows that whp, $|\tau(\bar{m}) - \tau(\bar{m}')| >
N^{1/4}$ $\forall \bar{m} \ne \bar{m}'$.

Thus whp there will be only one machine still processing operations during the interval
$[\ell(\bar{S}) - N^{1/4}, \ell(\bar{S})]$. Because $\max(\tau(J), \tau(J')) \le M\tau_{max} = O(1)$, we can use this interval
to construct optimal schedules containing the disjunctive arc $(J_i, J'_j)$ as well as optimal
schedules containing the disjunctive arc $(J'_j, J_i)$. □

5. Clustering as a Function of Makespan

In this section we estimate the expected distance between random schedules whose makespan
is within a factor $\rho$ of optimal, as a function of $\rho$ for various combinations of $N$ and $M$. We
then examine how the shape of this curve changes as a function of $N/M$. More formally, if

• $I$ is a random N by M JSP instance with optimal makespan $\ell_{min}(I)$,

• $\rho\_opt(I) = \{S : \ell(S) \le \rho \cdot \ell_{min}(I)\}$, and

• $S_1$ and $S_2$ are drawn independently at random from $\rho\_opt(I)$,

we wish to compute $E[\|S_1 - S_2\|]$.
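As an illustration of the quantity being estimated, the sketch below computes a disjunctive graph distance between two schedules and averages it over a sample. It assumes that $\|S_1 - S_2\|$ counts the disjunctive edges (pairs of operations sharing a machine) that the two schedules orient differently, and that a schedule is encoded by its per-machine job orderings; both the encoding and the function names are hypothetical.

```python
from itertools import combinations
from math import comb

def disjunctive_distance(order_a, order_b):
    """Count pairs of jobs that some machine processes in opposite orders.

    order_x[m] lists the job indices in the order machine m processes them.
    """
    dist = 0
    for seq_a, seq_b in zip(order_a, order_b):
        pos_a = {job: p for p, job in enumerate(seq_a)}
        pos_b = {job: p for p, job in enumerate(seq_b)}
        for j1, j2 in combinations(seq_a, 2):
            if (pos_a[j1] < pos_a[j2]) != (pos_b[j1] < pos_b[j2]):
                dist += 1
    return dist

def normalized_distance(order_a, order_b):
    """Distance divided by the number of disjunctive edges, M * C(N, 2)."""
    n_machines, n_jobs = len(order_a), len(order_a[0])
    return disjunctive_distance(order_a, order_b) / (n_machines * comb(n_jobs, 2))

def mean_pairwise_distance(schedules):
    """Sample mean of the distance over all unordered pairs of schedules."""
    pairs = list(combinations(schedules, 2))
    return sum(disjunctive_distance(a, b) for a, b in pairs) / len(pairs)
```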

Note that the experiments of §4 provide an upper bound on this quantity:

$E[\|S_1 - S_2\|] \le M\binom{N}{2} - E[|\rho\_backbone|]$

but provide no lower bound (a low backbone size is not evidence that the mean distance
between global optima is large). The experiments in this section can be viewed as a test of
the degree to which the upper bound provided by §4 is tight.

5.1 Methodology 

We generate "random" samples from $\rho\_opt(I)$ by running the simulated annealing algorithm
of van Laarhoven et al. (1992) until it finds such a schedule. More precisely, our procedure
for sampling distances is as follows.

1. Generate a random N by M JSP instance $I$.

2. Using the branch and bound algorithm of Brucker et al. (1994), determine the optimal
makespan of $I$.

3. Perform $k$ runs, $R_1, R_2, \ldots, R_k$, of the van Laarhoven et al. (1992) simulated annealing
algorithm. Restart each run as many times as necessary for it to find a schedule whose
makespan is optimal.

4. For each $\rho \in \{1, 1.01, 1.02, \ldots, 1.5\}$, find the first schedule, call it $S_i(\rho)$, in each run
$R_i$ whose makespan is within a factor $\rho$ of optimal. For each of the $\binom{k}{2}$ pairs of
runs $(R_i, R_j)$, add the distance between $S_i(\rho)$ and $S_j(\rho)$ to the sample of distances
associated with $\rho$.
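Step 4 of the procedure can be sketched as follows. The sketch assumes that each run has been logged as a list of `(makespan, schedule)` pairs in the order the schedules were visited and that the caller supplies a `distance` function; the log format and all names are hypothetical, and the simulated annealing algorithm itself is not reproduced.

```python
from itertools import combinations

def first_within_factor(run, rho, optimal_makespan):
    """Return the first schedule in a run whose makespan is within rho of optimal."""
    for makespan, schedule in run:
        if makespan <= rho * optimal_makespan:
            return schedule
    # Cannot happen if the run was restarted until it reached an optimal schedule.
    raise ValueError("run never reached a schedule within the requested factor")

def sample_distances(runs, optimal_makespan, distance, rhos):
    """For each rho, collect the distance for every unordered pair of runs."""
    samples = {}
    for rho in rhos:
        firsts = [first_within_factor(run, rho, optimal_makespan) for run in runs]
        samples[rho] = [distance(a, b) for a, b in combinations(firsts, 2)]
    return samples

# The grid of factors used in step 4: 1, 1.01, ..., 1.5.
RHOS = [1 + i / 100 for i in range(51)]
```

With $k$ runs, each value of $\rho$ receives $\binom{k}{2}$ distances, so $k = 2$ yields one distance per instance and $k = 100$ yields 4950.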

We ran this procedure on random JSP instances for the same 7 combinations of $N$ and
$M$ that were used in §4.2. For the smallest instance sizes for each ratio (i.e., 6x6, 8x4 and
9x3 instances) we generate 100 random JSP instances and run the procedure with $k = 100$.
Setting $k = 100$ allows us to measure the variation in instance-specific expected values. For
the other 4 combinations of $N$ and $M$, performing 10,000 simulated annealing runs is too
computationally expensive, so we instead generate 1000 random JSP instances and run the
procedure with $k = 2$.

Figure 5 (A), (B), and (C) plot the expected distance between random $\rho$-optimal schedules
as a function of $\rho$ for each of the three values of $N/M$. Figure 5 (D) shows the 0.75
and 0.25 quantiles of the 100 instance-specific sample means for each of the three smallest
instance sizes. Examining Figure 5 (D), we see that the variation among random instances
with the same $N$ and $M$ is small relative to the differences between the curves for different
values of $N/M$.

5.2 Discussion 

By examining Figure 5 we see that for any $\rho$, the expected distance between random
$\rho$-optimal schedules increases as $N/M$ increases. Indeed, global optima are dispersed widely
throughout the search space for $N/M = 3$, and this is true to a lesser extent for $N/M = 2$.

An immediate implication of Figure 5 is that whether or not they exhibit the two
correlations that are the operational definition of a big valley, typical landscapes for JSP
instances with $N/M = 3$ cannot be expected to be big valleys in the sense of having a central
cluster of optimal or near-optimal solutions. If anything, one might posit the existence of
multiple big valleys, each leading to a separate global optimum. The next section expands
upon these observations.

6. The Big Valley 

In this section we define some formal properties of a big valley landscape, conduct experiments
to determine the extent to which random JSP instances exhibit these properties as
we vary $N/M$, and present analytical results for the limiting cases $N/M \to 0$ and $N/M \to \infty$.

Considering again the "intuitive picture" given in Figure 1, we take the following to be
necessary (though perhaps not sufficient) conditions for a function $f(x)$ to be a big valley.

1. Small improving moves. If $x$ is not a global minimum of $f$, there must exist a nearby
$x'$ with $f(x') < f(x)$.

2. Clustering of global optima. The maximum distance between any two global minima
of $f$ is small.

Note that there is no direct relationship between these two properties and the cost-distance 
correlations considered by Boese et al. (1994). 

6.1 Formalization 

The following four definitions allow us to formalize the notion of a big valley landscape. 

Definition (Neighborhood $\mathcal{N}_r$). Let $I$ be an arbitrary JSP instance, and let $U$ be the set
of all schedules for $I$. Let $r$ be a positive integer. The neighborhood $\mathcal{N}_r : U \to 2^U$ is defined
by

$\mathcal{N}_r(S) = \{S' \in U : \|S - S'\| \le r\}$.

Definition (local optimum $\mathcal{L}(S, \mathcal{N})$). Let $I$ and $U$ be as above; let $\mathcal{N} : U \to 2^U$ be
an arbitrary neighborhood function; and let $S$ be a schedule for $I$. $\mathcal{L}(S, \mathcal{N})$ is the schedule
returned by the following procedure (which finds a local optimum by performing next-descent
starting from $S$ using the neighborhood $\mathcal{N}$).

Figure 5: Expected distance between random schedules within a factor $\rho$ of optimal, as a
function of $\rho$. Graphs (A), (B), and (C) depict curves for random instances with
$N/M$ = 1, 2, and 3, respectively. Graph (D) compares the curves depicted in (A),
(B), and (C) (only the curves for the smallest instance sizes are shown in (D)). In
(D), top and bottom error bars represent 0.75 and 0.25 quantiles (respectively)
of instance-specific sample means.

Figure 6: Two landscapes comprised of $(r, \delta)$-valleys. (A) is a single $(r, \delta)$-valley (for the
values of $r$ and $\delta$ shown in the figure), while (B) can either be viewed as three
distinct $(r, \delta)$-valleys or as a single $(r, \delta')$-valley. (The values of $r$ shown in the
figure are slightly larger than necessary.)



1. Let $\mathcal{N}(S) = \{S_1, S_2, \ldots, S_{|\mathcal{N}(S)|}\}$ (where the elements of $\mathcal{N}(S)$ are indexed in a fixed
but arbitrary manner).

2. Find the least $i$ such that $\ell(S_i) < \ell(S)$. If no such $i$ exists, return $S$; otherwise set
$S \leftarrow S_i$ and go to 1.
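The two-step procedure above can be written as a short Python sketch. The function is generic: `neighbors` is assumed to return the neighborhood as a list in a fixed order, and `cost` plays the role of the makespan $\ell$; both are hypothetical callables supplied by the caller.

```python
def next_descent(start, neighbors, cost):
    """Move to the first strictly improving neighbor until none exists.

    neighbors(s) must return the neighborhood of s in a fixed order;
    cost(s) is the objective being minimized (the makespan in the text).
    """
    current = start
    while True:
        for candidate in neighbors(current):
            if cost(candidate) < cost(current):
                current = candidate  # least index i with an improvement
                break
        else:
            return current  # no improving neighbor: a local optimum
```

Because the first improving neighbor is taken rather than the best one, the local optimum that is reached can depend on the indexing of the neighborhood.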

Definition ($(r, \delta)$-valley). Let $I$ and $U$ be as above, and let $r$ and $\delta$ be non-negative
integers. A set $V \subseteq U$ is an $(r, \delta)$-valley if $V$ has the following two properties.

1. For any $S \in V$, the schedule $\mathcal{L}(S, \mathcal{N}_r)$ is in $V$ and is globally optimal.

2. For any two globally optimal schedules $S_1$ and $S_2$ that are both in $V$, $\|S_1 - S_2\| \le \delta$.

Figure 6 illustrates the definition of an $(r, \delta)$-valley. We would say that the landscape
depicted in Figure 6 (A) is a big valley, while that depicted in 6 (B) is comprised of three
big valleys.

Definition ($(r, \delta, p)$ landscape). Let $I$ and $U$ be as above, and let $S$ be a random schedule
for $I$. Then $I$ has an $(r, \delta, p)$ landscape if there exists a $V \subseteq U$ such that

1. $V$ is an $(r, \delta)$-valley, and

2. $P[S \in V] \ge p$.

Any JSP instance trivially has an $(M\binom{N}{2}, M\binom{N}{2}, 1)$ landscape (because if $r = M\binom{N}{2}$
then $\mathcal{N}_r$ includes all possible schedules). If a JSP instance $I$ has an $(r, M\binom{N}{2}, 1)$ landscape,
then a globally optimal schedule for $I$ can always be found by starting at a random schedule
and applying next-descent using the neighborhood $\mathcal{N}_r$.

We say that a JSP instance $I$ has a big valley landscape if $I$ has an $(r, \delta, p)$ landscape for
small $r$ and $\delta$ in combination with $p$ near 1. In contrast, if we have small $r$ in combination
with $p$ near 1 but require large $\delta$, we say that the landscape consists of multiple big valleys.

6.2 Neighborhood Exactness

In this section we seek to determine the extent to which random JSP instances have the 
"small improving moves" property. We require the following definition. 

Definition (neighborhood exactness). Let $I$, $U$, and $\mathcal{N}$ be as above, and let $S$ be a
random schedule for $I$. The exactness of the neighborhood $\mathcal{N}$ on the instance $I$ is the
probability that $\mathcal{L}(S, \mathcal{N})$ is a global optimum.

If the exactness of $\mathcal{N}_r$ is $p$, then $I$ has an $(r, M\binom{N}{2}, p)$ landscape (let $V$ consist of all
schedules $S$ such that $\mathcal{L}(S, \mathcal{N}_r)$ is a global optimum). We will estimate the expected exactness
of $\mathcal{N}_r$ as a function of $r$ for various combinations of $N$ and $M$. By examining the resulting
curves, we will be able to draw conclusions about the extent to which the landscape of a
random N by M JSP instance typically has the "small improving moves" property. We can
then determine how the presence or absence of this property depends on $N/M$.

For fixed $N$ and $M$, we compute the expected exactness of $\mathcal{N}_r$ for $1 \le r \le M\binom{N}{2}$ by
repeatedly executing the following procedure.

1. Generate a random N by M JSP instance $I$.

2. Using the algorithm of Brucker et al. (1994), compute the optimal makespan of $I$.

3. Repeat $k$ times:

(a) $S \leftarrow$ a random feasible schedule, $r \leftarrow 1$, $opt \leftarrow$ false.

(b) While $opt$ = false do:

• $S \leftarrow \mathcal{L}(S, \mathcal{N}_r)$.

• If $S$ is a global optimum, $opt \leftarrow$ true.

• Record the pair $(r, opt)$.

• $r \leftarrow r + 1$.

(c) For all $r'$ such that $r \le r' \le M\binom{N}{2}$ record the pair $(r', true)$.

The pairs recorded by the procedure (in step 3(c) and the third bullet point of 3(b))
are used in the obvious way to estimate expected exactness. Specifically, for each $r$ the
estimated expected exactness of $\mathcal{N}_r$ is the fraction of pairs $(r, x)$ for which $x$ = true.
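The bookkeeping in step 3 and the final estimate can be sketched in Python as follows. The callables `descend` (standing in for $S \leftarrow \mathcal{L}(S, \mathcal{N}_r)$) and `is_optimal` are hypothetical and supplied by the caller; the radius-limited search used to implement the descent is not reproduced here.

```python
from collections import defaultdict

def record_run(descend, is_optimal, start, r_max):
    """Steps 3(a)-(c) for one random starting schedule.

    descend(s, r) returns the local optimum reached from s with radius r;
    r_max is the total number of disjunctive edges, M * C(N, 2).
    """
    pairs = []
    s, r, opt = start, 1, False
    while not opt and r <= r_max:      # guard added for safety in this sketch
        s = descend(s, r)
        opt = is_optimal(s)
        pairs.append((r, opt))
        r += 1
    # Step 3(c): every larger radius would also have reached the optimum.
    pairs.extend((r_prime, True) for r_prime in range(r, r_max + 1))
    return pairs

def estimate_exactness(all_pairs):
    """For each radius r, the fraction of recorded pairs (r, x) with x true."""
    counts = defaultdict(lambda: [0, 0])   # r -> [number true, number recorded]
    for r, opt in all_pairs:
        counts[r][0] += int(opt)
        counts[r][1] += 1
    return {r: true / total for r, (true, total) in sorted(counts.items())}
```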

The implementation of the first bullet point in step 3(b) deserves further discussion. To
determine $\mathcal{L}(S, \mathcal{N}_r)$, each step of next-descent must be able to determine the best schedule
in $\{S' : \|S - S'\| \le r\}$. For large $r$ it is impractical to do this by brute force. Instead we have
developed a "radius-limited" branch and bound algorithm that, given an arbitrary center
schedule $S_c$ and radius $r$, finds the schedule $\arg\min_{\{S' : \|S_c - S'\| \le r\}} \ell(S')$. Our radius-limited
branch and bound algorithm uses the branching rule of Balas (1969) combined with the
lower bounds and branch ordering heuristic of Brucker et al. (1994).

6.3 Results

We use three combinations of $N$ and $M$ with $N/M = 1/5$ (3x15, 4x20, and 5x25 instances), three
combinations with $N/M = 1$ (6x6, 7x7, and 8x8 instances) and two combinations with $N/M = 5$
(15x3 and 20x4 instances). For the smallest instance sizes for each ratio (i.e., 3x15, 6x6,
and 15x3 instances) we generate 100 random JSP instances and run the above procedure
with $k = 100$. Otherwise, we generate 1000 random JSP instances and run the procedure
with $k = 1$.

Figure 7 (A), (B), and (C) plot expected exactness as a function of neighborhood radius
(normalized by the number of disjunctive edges) for each of these three values of $N/M$. Figure
7 (D) shows the 0.75 and 0.25 quantiles of the 100 instance-specific sample means for each
of the three smallest instance sizes.

6.4 Discussion 

Examining Figure 7, we see that for any normalized neighborhood radius, the neighborhood
exactness is lowest for instances with $N/M = 1$ and higher for the two more extreme ratios
($N/M = 1/5$ and $N/M = 5$). If we view neighborhood exactness as measuring the "smoothness"
of a landscape, the data suggest that typical JSP landscapes are least smooth at some
intermediate value of $N/M$, but become more smooth as $N/M \to 0$ or $N/M \to \infty$. This in
itself suggests an easy-hard-easy pattern of typical-case instance difficulty in the JSP, a
phenomenon explored more fully in the next section.

Using the methodology of §§4-5, we found that the expected proportions of backbone
edges for 3x15, 4x20, and 5x25 instances are 0.94, 0.93, and 0.92, respectively, while the
expected distance between global optima was 0.02 in all three cases. In contrast, the
expected proportions of backbone edges for 15x3 and 20x4 instances are near-zero, while
the expected distances between global optima are 0.33 and 0.28, respectively. We conclude
that landscapes of random N by M JSP instances typically have the "clustering of global
optima" property for $N/M = 1/5$ but not for $N/M = 5$. However, Figure 7 suggests that the "small
improving moves" property is present for both $N/M = 1/5$ and $N/M = 5$. Accordingly, we would
say that typical landscapes for $N/M = 1/5$ are big valleys, while for $N/M = 5$ the landscape is
comprised of many big valleys rather than just one.

The data from §§4-5 show that for $N/M = 1$, typical landscapes have the "clustering of
global optima" property. Examining Figure 7 (B), we see that we are able to descend from
a random schedule to a globally optimal schedule with probability 1/2 when the (normalized)
neighborhood radius is about 6%. For this reason, we think of the landscapes of random
JSP instances with $N/M = 1$ as having the "small improving moves" property to some extent.
This, in combination with the curve in Figure 5 (A) (which shows expected distance between
random $\rho$-optimal schedules as a function of $\rho$) leads us to say that typical landscapes of
random JSP instances with $N/M = 1$ can still be roughly described as big valleys. However,
the valley is much rougher (meaning that larger steps are required to move from a random
schedule to a global optimum via a sequence of improving moves) than for the more extreme
values of $N/M$.

Table 1 summarizes the empirical findings just discussed. 

Figure 7: Expected exactness of $\mathcal{N}_r$ as a function of the (normalized) neighborhood radius
$r$. Graphs (A), (B), and (C) depict curves for random instances with $N/M$ = 1/5,
1, and 5, respectively. Graph (D) compares the curves depicted in (A), (B), and
(C) (only the curves for the largest instances are shown in (D)). In (D), top and
bottom error bars represent 0.75 and 0.25 quantiles (respectively) of
instance-specific exactness.

Table 1. Landscape attributes for three values of $N/M$.

N/M    Clustering of global optima?    Small improving moves?    Description
1/5    Yes                             Yes                       Big valley
1      Yes                             Somewhat                  (Rough) big valley
5      No                              Yes                       Multiple big valleys

6.5 Analysis 

We first establish the behavior of the curves depicted in Figure 7 in the limiting cases
$N/M \to 0$ and $N/M \to \infty$. We then use these results to characterize the landscapes of random
JSP instances using the $(r, \delta, p)$ notation introduced in §6.1.

The following two lemmas show that as $N/M \to 0$ (resp. $N/M \to \infty$), a random schedule
will almost surely be "close" to an optimal schedule. The proofs are given in Appendix A.

Lemma 3. Let $I$ be a random N by M JSP instance, and let $S$ be a random schedule for
$I$. Let $\hat{S}$ be an optimal schedule for $I$ such that $\|S - \hat{S}\|$ is minimal. Let $f(M)$ be any
unbounded, increasing function of $M$. Then for fixed $N$, it holds whp (as $M \to \infty$) that
$\|S - \hat{S}\| \le f(M)$.

Lemma 4. Let $I$ be a random N by M JSP instance, let $S$ be a random schedule for $I$,
and let $\hat{S}$ be an optimal schedule for $I$ such that $\|S - \hat{S}\|$ is minimal. Then for fixed $M$ and
$\epsilon > 0$, it holds whp (as $N \to \infty$) that $\|S - \hat{S}\| \le N^{1+\epsilon}$.

The following are immediate corollaries of Lemmas 3 and 4. 

Corollary 3. For fixed $N$, the expected exactness of $\mathcal{N}_{f(M)}$ approaches 1 as $M \to \infty$, where
$f(M)$ is any unbounded, increasing function of $M$.

Corollary 4. For fixed $M$ and $\epsilon > 0$, the expected exactness of $\mathcal{N}_{N^{1+\epsilon}}$ approaches 1 as
$N \to \infty$.

Because the total number of disjunctive edges is $M\binom{N}{2}$, these two corollaries imply that
as $N/M \to 0$ (resp. $N/M \to \infty$), the curve depicted in Figure 7 approaches a horizontal line at
a height of 1.

Using Lemmas 3 and 4, Theorems 3 and 4 characterize the landscape of random JSP
instances using the $(r, \delta, p)$ notation of §6.1. Before presenting these theorems, a slight
disclaimer is in order. Lemmas 3 and 4 (the proofs of which are fairly involved) indicate
that in the extreme cases $N/M \to 0$ and $N/M \to \infty$ we can jump from a random schedule to a
globally optimal schedule via a single small move. We strongly believe that in these cases
it is also possible to go from a random schedule to a global optimum by a sequence of many
(smaller) improving moves, although proving this seems difficult. Nevertheless, it should
be understood that our theoretical results do not strictly imply the existence of landscapes
like those depicted in Figure 6 (where for most starting points there is a sequence of two or
more small improving moves leading to a global optimum).

Theorem 3 shows that as $N/M \to 0$, a random JSP instance almost certainly has an
$(r, \delta, p)$ landscape where $r$ grows arbitrarily slowly as a function of $M$, $\delta$ is $o(M\binom{N}{2})$, and
$p$ is arbitrarily close to 1. In other words, as $N/M \to 0$ the landscape has both the "small
improving move(s)" property and the "clustering of global optima" property. In contrast,
Theorem 4 shows that as $N/M \to \infty$, a random JSP instance almost surely does not have an
$(r, \delta, p)$ landscape unless $\delta$ is $\Omega(N^2)$. Instead, the landscape contains $\Omega(N!)$ $(r, 1)$-valleys,
where $r$ is $o(M\binom{N}{2})$. Thus, as $N/M \to \infty$, the landscape has the "small improving move(s)"
property but not the "clustering of global optima" property. These analytical results confirm
the trend suggested by Figure 7 and discussed in §6.4.

Theorem 3. Let $I$ be a random N by M JSP instance. Let $f(M)$ be any unbounded,
increasing function of $M$. For fixed $N$ and $\epsilon > 0$, it holds whp (as $M \to \infty$) that $I$ has an
$(r, \delta, p)$ landscape for $r = f(M)$, $\delta = \epsilon M\binom{N}{2}$, and $p = 1 - \epsilon$.

Proof. Let $V$ be the set of all schedules $S$ such that $\mathcal{L}(S, \mathcal{N}_r)$ is a global optimum. It follows
by Corollary 3 that whp, the exactness of $\mathcal{N}_r$ on $I$ is at least $p$, which means $S \in V$ with
probability at least $p$. It remains to show that $V$ is an $(r, \delta)$-valley whp. Part 1 of the
definition of an $(r, \delta)$-valley is satisfied by the definition of $V$. Part 2 follows from Theorem
1. □



Theorem 4. Let $I$ be a random N by M JSP instance, and let $S$ be a random schedule for
$I$. There exists a set $V(I) = \bigcup_{i=1}^{n} V_i$ of schedules for $I$ such that for fixed $M$ and $\epsilon > 0$, $V$
has the following properties whp:

1. $S \in V$;

2. $V_i$ is an $(r, \delta)$-valley with $r = N^{1+\epsilon}$ and $\delta = 1$ $\forall i \in [n]$;

3. $n \ge N!(1 - \epsilon)$; and

4. $\max_{\{S_1, S_2\} \subseteq V} \|S_1 - S_2\| \in \Omega(N^2)$.

Proof. Let $\{\hat{S}_1, \hat{S}_2, \ldots, \hat{S}_n\}$ be the set of globally optimal schedules for $I$, and define $V_i =
\{S : \mathcal{L}(S, \mathcal{N}_{N^{1+\epsilon}}) = \hat{S}_i\}$. Property 1 holds whp by Lemma 4. Property 2 holds by definition
of $V_i$.

The fact that property 3 holds whp is a consequence of Lemma 2. Recall that Lemma
2 showed that as $N/M \to \infty$, the priority rule $\pi_\infty$ generates an optimal schedule whp, where
$\pi_\infty(I, J^k_i) = iN + k$. Because the indices assigned to the jobs are arbitrary, Lemma 2 also
applies to the priority rule $\pi^{\phi}_\infty(I, J^k_i) = iN + \phi(k)$, where $\phi$ is any permutation of $[N]$. There
are $N!$ possible choices of $\phi$. Let $f$ be the number of choices that fail to yield a globally
optimal schedule. Property 3 can only fail to hold if $f > \epsilon N!$. But by Lemma 2, $E[f]$ is
$o(1)N!$; hence $f < \epsilon N!$ whp by Markov's inequality.

To establish property 4, choose permutations $\phi_1$ and $\phi_2$ that list the elements of $[N]$
in reverse order (i.e., $\phi_1(i) = \phi_2(N - i)$ $\forall i \in [N]$). By Lemma 2, the schedules $S_1 =
S(\pi^{\phi_1}_\infty, I)$ and $S_2 = S(\pi^{\phi_2}_\infty, I)$ are both globally optimal whp. But for any disjunctive edge
$e = \{J_i, J'_i\}$ we must have $e(S_1) \ne e(S_2)$, hence $\|S_1 - S_2\| \ge |\{\{J, J'\} \subseteq I : m(J_1) =
m(J'_1)\}| \ge \binom{\lceil N/M \rceil}{2} = \Omega(N^2)$, where we obtain the expression $\binom{\lceil N/M \rceil}{2}$ using the pigeonhole
principle. □

7. Quality of Random Schedules

7.1 Methodology

In this section we examine how the quality of randomly generated schedules changes as a
function of the job:machine ratio. Specifically, for various combinations of $N$ and $M$, we
estimate the expected value of the following four quantities:

(A) the makespan of a random schedule, 

(B) the makespan of a locally optimal schedule obtained by starting at a random schedule
and applying next-descent using the $\mathcal{N}_1$ move operator,

(C) the makespan of an optimal schedule, and 

(D) the lower bound on the makespan of an optimal schedule given by the maximum of
the maximum job duration and the maximum machine workload:

$\max\left( \max_{J \in I} \tau(J),\ \max_{\bar{m} \in [M]} \sum_{o \in ops(I) : m(o) = \bar{m}} \tau(o) \right)$.
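Quantity (D) is straightforward to compute. The sketch below does so for an instance encoded, by assumption, as `instance[k][i] = (machine, duration)` for the i-th operation of job k; the encoding and the function name are ours and serve only as an illustration.

```python
from collections import defaultdict

def makespan_lower_bound(instance):
    """Maximum of the longest job duration and the largest machine workload."""
    # Total duration of each job: its operations must run one after another.
    max_job_duration = max(sum(d for _, d in job) for job in instance)
    # Total workload of each machine: it processes one operation at a time.
    workload = defaultdict(int)
    for job in instance:
        for machine, duration in job:
            workload[machine] += duration
    return max(max_job_duration, max(workload.values()))
```

For the instance `[[(0, 3), (1, 2)], [(0, 1), (1, 4)]]` both jobs last 5 time units while machine 1 carries a workload of 6, so the bound is 6.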

The values of $N/M$ considered in our experiments are those in the set $R$ = {1/7, 1/6, 1/5, 1/4,
1/3, 1/2, 2/3, 1, 3/2, 2, 3, 4, 5, 6, 7}. We consider all combinations of $N$ and $M$ in the set



$S = \bigcup_{r \in R} S_r$, where $S_r = \{(N, M) : N/M = r, \min(N, M) \ge 2, \max(N, M) \ge 6, NM < 1000\}$.
For each $(N, M) \in S$, we estimate the expected value of (A) (resp. (B)) by generating 100
random N by M JSP instances and, for each instance, generating 100 random schedules
(resp. local optima). We estimate (D) by generating 1000 random JSP instances for each
$(N, M) \in S$. For some combinations $(N, M) \in S_{small} \subseteq S$, it was also practical to compute
quantity (C). Let $n_r = |S_{small} \cap S_r|$ be the number of combinations $(N, M)$ with $N/M = r$ for
which we computed (C). We chose S small so that > 4 for r 7^ | while ns = 3. For each 
$(N, M) \in S_{small}$, we estimate (C) using 1000 random JSP instances.



7.2 Results 

Figure 8 plots the mean values of (A), (B), and (C), respectively, against the mean value of
(D), for various combinations of $N$ and $M$. The data points for each combination of $N$ and
$M$ are assigned a symbol based on the value of $N/M$. Top and bottom error bars represent 0.75
and 0.25 quantiles (respectively) of instance-specific sample means. Note that the width of
these error bars is small relative to the differences between the curves for different values
of $N/M$.

Examining Figure 8, we see that the set of data points for each value of $N/M$ are approximately
(though not exactly) collinear. Furthermore, in all three graphs the slope of the line
formed by the data points with $N/M = r$ is maximized when $r = 1$, and decreases as $r$ gets
further away from 1 (see also Figure 9 (A)).

To further investigate this trend, we performed least squares linear regression on the set
of data points for each value of $N/M$. The slopes of the resulting lines are shown as a function
of $N/M$ in Figure 9 (A).
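The regression step can be illustrated with a small Python sketch. It assumes an ordinary least squares fit with an intercept and a hypothetical data layout in which each point is a `(ratio, mean_lower_bound, mean_makespan)` triple; neither detail is spelled out in the text.

```python
from collections import defaultdict

def least_squares_slope(xs, ys):
    """Slope of the ordinary least squares line y = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def slopes_by_ratio(points):
    """Group (ratio, x, y) triples by ratio and fit one line per group."""
    groups = defaultdict(lambda: ([], []))
    for ratio, x, y in points:
        groups[ratio][0].append(x)
        groups[ratio][1].append(y)
    return {ratio: least_squares_slope(xs, ys) for ratio, (xs, ys) in groups.items()}
```

Here `x` plays the role of the mean value of (D) and `y` the mean value of (A), (B), or (C), so a slope close to 1 means that the plotted quantity tracks the lower bound closely.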

From examination of Figure 9 (A) , it is apparent that 

Figure 8: Expected makespan of (A) random schedules, (B) random local optima, and (C)
optimal schedules vs. expected lower bound, for various combinations of $N$ and
$M$ (grouped by symbol according to $N/M$). Top and bottom error bars represent
0.75 and 0.25 quantiles (respectively) of instance-specific sample means.

Figure 9: (A) graphs the slope of the least squares fits to the data in Figure 8 (A), (B),
and (C) as a function of $N/M$ (includes values of $N/M$ not depicted in Figure 8). (B)
graphs the number of search tree nodes (90th percentile) used by the branch and
bound algorithm of Brucker et al. (1994) to find an optimal schedule.

• as the value of $N/M$ becomes more extreme (i.e., approaches either 0 or $\infty$), the expected
makespan of random schedules (resp. random local optima) comes closer to
the expected value of the lower bound on makespan; and

• the difference between the expected makespan of random schedules (resp. random
local optima) and the expected value of the lower bound on makespan is maximized
at a value of $N/M \approx 1$.

The first of these two observations suggests that as $N/M$ approaches either 0 or $\infty$, a
random schedule is almost certainly near-optimal. §7.3 contains two theorems that confirm
this.

The second of these two observations suggests that the expected difference between the
makespan of a random schedule and the makespan of an optimal schedule is maximized at a
value of $N/M$ somewhere in the neighborhood of 1. This observation is particularly interesting
in light of the empirical fact that square instances of the JSP (i.e., those with $N/M = 1$) are
harder to solve than rectangular ones (Fisher & Thompson, 1963).

Figure 9 (B) graphs the number of search tree nodes (90th percentile) required by the
branch and bound algorithm of Brucker et al. (1994) to optimally solve random N by M
instances, as a function of the log (base 10) of search space size. We take the size of the
search space for an N by M JSP instance to be the number of possible disjunctive graphs,
namely $2^{M\binom{N}{2}}$. Note that some of these disjunctive graphs contain cycles and therefore do
not correspond to feasible schedules, so this expression overestimates the size of the search
space. Data points are given for each combination of $N$ and $M$ for which we could afford to
run branch and bound (i.e., each combination of $N$ and $M$ for which we computed quantity
(C)). The data points are grouped into curves according to $N/M$.
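For reference, the horizontal coordinate described above can be computed directly; the short Python sketch below (function name ours) evaluates $\log_{10} 2^{M\binom{N}{2}} = M\binom{N}{2}\log_{10} 2$ without forming the large power.

```python
import math

def log10_search_space_size(n_jobs, n_machines):
    """log10 of the number of disjunctive graphs, 2 ** (M * C(N, 2))."""
    n_disjunctive_edges = n_machines * math.comb(n_jobs, 2)
    return n_disjunctive_edges * math.log10(2)
```

A 6x6 instance, for example, has 6 * 15 = 90 disjunctive edges, giving a search space of roughly $10^{27}$ disjunctive graphs.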

Examining Figure 9 (B), we see that the curves are steepest for the ratios 2/3, 1, 3/2, 2,
and 3, and that the curves are substantially less steep for extreme values of $N/M$ such as 1/7
and 7. Thus, at least from the point of view of this particular branch and bound algorithm,
random JSP instances exhibit an "easy-hard-easy" pattern of instance difficulty. We discuss
this pattern further in §7.4.

7.3 Analysis 

The following two theorems show that, as $N/M$ approaches either 0 or $\infty$, a random schedule
will almost surely be near-optimal.

Theorem 5. Let $I$ be a random N by M JSP instance with optimal makespan $\ell_{min}(I)$ and
let $S$ be a random schedule for $I$. Then for fixed $N$ and $\epsilon > 0$, it holds whp (as $M \to \infty$)
that $\ell(S) \le (1 + \epsilon)\ell_{min}(I)$.

Proof. The priority rule $\pi_{rand}$ associates a priority with each operation $o \in ops(I)$. Let
the sequence $T$ contain the elements of $ops(I)$, sorted in ascending order of priority. The
schedule $S = S(\pi_{rand}, I)$ depends only on $T$, and there are $(NM)!$ possible choices of $T$. Thus
$\pi_{rand}$ can be seen as choosing at random from a set of $(NM)!$ instance-independent priority
rules. Because each of these instance-independent priority rules is subject to Lemma 1, $\pi_{rand}$
is also subject to Lemma 1 and thus for each $J$, $E[\Delta_J]$ is $O(N)$. Thus $E[\ell(S) - \ell_{min}(I)] \le
\sum_J E[\Delta_J] = O(N^2)$, so $\ell(S) - \ell_{min}(I)$ does not exceed $\epsilon\ell_{min}(I) = \Omega(M)$ whp by Markov's
inequality. □
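The final step of the proof can be spelled out as follows. This is a sketch of how we read the argument, under the assumption (stated here for concreteness) that there is a constant $c > 0$ with $\ell_{min}(I) \ge cM$ whp, which is what the $\Omega(M)$ in the proof expresses. Since $\ell(S) - \ell_{min}(I)$ is non-negative, Markov's inequality gives

```latex
\Pr\bigl[\ell(S) - \ell_{min}(I) \ge \epsilon c M\bigr]
  \;\le\; \frac{E[\ell(S) - \ell_{min}(I)]}{\epsilon c M}
  \;=\; \frac{O(N^2)}{\epsilon c M}
  \;\longrightarrow\; 0
  \qquad (M \to \infty,\ N \text{ fixed}).
```

Hence whp $\ell(S) - \ell_{min}(I) < \epsilon c M \le \epsilon\ell_{min}(I)$, which is the inequality $\ell(S) \le (1 + \epsilon)\ell_{min}(I)$ claimed in the theorem.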

Theorem 6. Let $I$ be a random N by M JSP instance with optimal makespan $\ell_{min}(I)$ and
let $S$ be a random schedule for $I$. Then for fixed $M$ and $\epsilon > 0$, it holds whp (as $N \to \infty$)
that $\ell(S) \le (1 + \epsilon)\ell_{min}(I)$.

Proof. See Appendix A. □ 

The idea behind the proof of Theorem 6 is the following. As shown in Lemma 2, the
priority rule $\pi_\infty$ almost surely generates an optimal schedule. The relevant property of $\pi_\infty$
was that, when the operations were sorted in order of ascending priority, the number of
operations in between $\mathcal{J}(o)$ and $o$ was $\Omega(N)$. The key to the proof of Theorem 6 is that in
expectation, $\pi_{rand}$ shares this property for most of the operations $o \in ops(I)$.

7.4 Easy-hard-easy Pattern of Instance Difficulty 

The proofs of Corollary 1 (resp. Lemma 2) show that as $N/M \to 0$ (resp. $N/M \to \infty$) there exist
simple priority rules that almost surely produce an optimal schedule. Moreover, Theorems
5 and 6 show that in these two limiting cases, even a random schedule will almost surely
have makespan that is very close to optimal. Thus, both as $N/M \to 0$ and as $N/M \to \infty$, almost
all JSP instances are "easy".

In contrast, for $N/M \approx 1$, Figure 9 (A) suggests that random schedules (as well as random
local optima) are far from optimal. The literature on the JSP (as well as the results depicted
in Figure 9 (B)) attests to the fact that random JSP instances with $N/M \approx 1$ are "hard".
Thus we conjecture that, as in 3-SAT, typical instance difficulty in the JSP follows an
"easy-hard-easy" pattern as a function of a certain parameter. In contrast to 3-SAT, the
"easy-hard-easy" pattern in the JSP is not (to our knowledge) associated with a phase transition
(i.e., we have not identified a quantity that undergoes a sharp threshold at $N/M \approx 1$).

Furthermore, although the empirical results in Figures 9 (A) and (B) support the idea
that typical-case instance difficulty in the JSP follows an "easy-hard-easy" pattern, we
do not claim to have isolated any particular value of $N/M$ as being the point of maximum
difficulty. As shown in Figure 9 (B), random N by M JSP instances are most difficult
for the branch and bound algorithm of Brucker et al. (1994) when $N/M \approx 2$, but this may not
be true of other branch and bound algorithms or of JSP heuristics based on local search.
We leave the task of characterizing the "easy-hard-easy" pattern more precisely as future
work.

In related work, Beck (1997) studied a constraint-satisfaction (as opposed to
makespan-minimization) version of the JSP, and gave empirical evidence that the probability that a
random JSP instance is satisfiable undergoes a sharp threshold as a function of a quantity
called the constrainedness of the instance.

8. Limitations and Extensions 

The primary limitation of the work reported in this paper is that both our theoretical and 
empirical results apply only to random instances of the job shop scheduling problem. There 
is no guarantee that our observations will generalize to instances drawn from distributions 
with more interesting structure (Watson et al., 2002). The difficulty in extending our
analysis to other distributions is that analytical results similar to the ones presented in
this paper may become much more difficult to derive. However, there are at least three
distributions that have been studied in the scheduling literature for which we believe it
should not be too difficult to adapt our proofs (the conclusions may change as part of the
adaptation process).

• Random workflow JSP instances. In a workflow JSP instance, the set of machines
is partitioned into sets (say $M_1, M_2, \ldots, M_k$). For $i < j$, each job must use all the
machines in $M_i$ before using any machines in $M_j$. Mattfeld et al. (1999) define
a random distribution over workflow JSPs which generalizes in a natural way the
distribution defined in §3.3 (the difference is that the permutations $\phi_1, \phi_2, \ldots, \phi_N$ are
chosen uniformly at random from the set of permutations that satisfy the workflow
constraints).

• Random instances of the (permutation) flow shop scheduling problem. An instance of 
the flow shop scheduling problem (FSP) is a JSP instance in which all jobs use the 
machines in the same order (equivalently, an FSP instance is a workflow JSP instance
with k = M). The permutation flow shop problem (PFSP) is a special case of the FSP
in which, additionally, each machine must process the jobs in the same order. There 
is a large literature on the (P)FSP; Framinan et al. (2004) and Hejazi and Saghafian 
(2005) provide relevant surveys. 

• Job-correlated and machine-correlated JSP instances. In a job-correlated JSP instance,
the distribution from which operation durations are drawn depends on the job to which
an operation belongs. Similarly, in a machine-correlated JSP instance the distribution
depends on the machine on which the operation is performed. Watson et al. (2002)
have studied job-correlated and machine-correlated instances of the PFSP.
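To make the first two distributions concrete, the following sketch samples a machine order that satisfies workflow constraints. It is an illustration under our own assumptions (machines are integers, the partition is given as a list of lists, and the function names are hypothetical), not code from Mattfeld et al. (1999). Shuffling each set independently and concatenating the results is uniform over the permutations that respect the partition.

```python
import random

def workflow_machine_order(machine_sets, rng):
    """Return a random machine order that visits the sets M_1, ..., M_k in sequence."""
    order = []
    for group in machine_sets:
        block = list(group)
        rng.shuffle(block)      # uniform over orders within this set
        order.extend(block)
    return order

def flow_shop_order(n_machines):
    """Case k = M: every set is a singleton, so all jobs share one machine order."""
    singletons = [[m] for m in range(n_machines)]
    return workflow_machine_order(singletons, random.Random(0))
```

With a single set containing all machines the sampler reduces to a uniformly random permutation, which is the ordinary random JSP case.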

Regarding the difficulty of instances drawn from these three distributions, computational 
experience shows that (i) random workflow JSPs are harder than random JSPs; (ii) ran- 
dom PFSPs are easier than random JSPs; and (iii) job-correlated and machine-correlated 
PFSPs are easier than random PFSPs. Extending our theoretical analysis to each of these 
distributions may give some insight into the relevant differences between them. 

8.1 The Big Valley vs. Cost-Distance Correlations 

In §6, we defined a "big valley" landscape as one that exhibits two properties: "small im- 
proving moves" and "clustering of global optima" . Our analytical and experimental results 
were based on this definition. Although we believe this definition captures properties of JSP 
landscapes that are important for designers of heuristics to understand, other properties 
(e.g., cost-distance correlations) are likely to be important as well. In particular, it may be 
possible for algorithms to exploit cost-distance correlations on landscapes that have neither 
the "small improving moves" nor the "clustering of global optima" properties. 
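As an illustration of what a cost-distance correlation measures, the sketch below computes the disjunctive graph distance between two schedules and correlates makespan with distance to the best schedule in a sample. The representation (one job sequence per machine), the use of the best sampled schedule as a stand-in for a global optimum, and the function names are assumptions made for this example.

```python
from itertools import combinations

def disjunctive_distance(seq_a, seq_b):
    """Count same-machine operation pairs that the two schedules order differently."""
    distance = 0
    for order_a, order_b in zip(seq_a, seq_b):
        pos_b = {job: i for i, job in enumerate(order_b)}
        for x, y in combinations(order_a, 2):   # x precedes y in schedule a
            if pos_b[x] > pos_b[y]:
                distance += 1
    return distance

def pearson(xs, ys):
    """Sample Pearson correlation; undefined if either list is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def cost_distance_correlation(schedules, costs):
    """Correlate each schedule's cost with its distance to the best sampled schedule."""
    best = schedules[costs.index(min(costs))]
    distances = [disjunctive_distance(s, best) for s in schedules]
    return pearson(costs, distances)
```

A landscape can score highly on this statistic without having the "small improving moves" or "clustering of global optima" properties, which is why the two notions are kept separate here.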

In the existing literature, the term "big valley" is used amorphously to mean either 
(1) a landscape like that depicted in Figure 1 or (2) a landscape that exhibits high cost- 
distance correlations. By making a sharper distinction between these two distinct concepts, 
we can only improve our understanding of JSP landscapes as well as the landscapes of other 
combinatorial problems.

9. Conclusions

9.1 Summary of Experimental Results 

Empirically, we demonstrated that for low values of the job to machine ratio ($N/M$),
low-makespan schedules are clustered in a small region of the search space and the backbone
size is high. As $N/M$ increases, low-makespan schedules become dispersed throughout the
search space and the backbone vanishes. As a function of $N/M$, the "smoothness" of the
landscape (as measured by a statistic called neighborhood exactness) starts out small for
low values of $N/M$ (e.g., $N/M = \frac{1}{5}$), is relatively high for $N/M \approx 1$, and becomes small again for
high values of $N/M$ (e.g., $N/M = 5$). For both extremely low and extremely high values of $N/M$,
the expected makespan of random schedules comes very close to that of optimal schedules.
The quality of random schedules (resp. random local optima) appears to be the worst at a
value of $N/M \approx 1$.

§6.4 discussed the implications of our results for the "big valley" picture of JSP search
landscapes. For $N/M \approx 1$, we concluded that a typical landscape can be described as a big
valley, while for larger values of $N/M$ (e.g., $N/M > 3$) there are many big valleys. §7.4 discussed
how our data support the idea that JSP instance difficulty exhibits an "easy-hard-easy"
pattern as a function of $N/M$.

9.2 Summary of Theoretical Results 

Table 2 shows the asymptotic expected values of various attributes of a random N by M
JSP instance in the limiting cases $N/M \to 0$ and $N/M \to \infty$.



Table 2. Attributes of random JSP instances.

Attribute                                    | Fixed N, $M \to \infty$          | Fixed M, $N \to \infty$
---------------------------------------------+----------------------------------+-------------------------------------
Optimum makespan                             | Max. job length (Corollary 1)    | Max. machine workload (Corollary 2)
Normalized backbone size                     | 1 (Theorem 1)                    | 0 (Theorem 2)
Normalized maximum distance between          | 0 (Theorem 1)                    | (Theorem 4)
global optima                                |                                  |
Normalized distance between random           | 0 (Lemma 3)                      | 0 (Lemma 4)
schedule and nearest global optimum          |                                  |
Ratio of makespan of random schedule         | 1 (Theorem 5)                    | 1 (Theorem 6)
to optimum makespan                          |                                  |



9.3 Rules of Thumb for Designing JSP Heuristics 

Though we do not claim to have any deep insights into how to solve random instances of 
the JSP, our results suggest two general rules of thumb: 



• when $N/M$ is low (say, $N/M \approx 1$ or lower), an algorithm should attempt to locate the
cluster of global optima and exploit it; while
• when $N/M$ is high (say, $N/M > 3$) an algorithm should attempt to isolate one or more
clusters of global optima and deal separately with each of them.

We briefly discuss these ideas in relation to two recent algorithms: backbone-guided local
search (Zhang, 2004) and i-TSAB (Nowicki & Smutnicki, 2005).

9.3.1 Backbone-guided local search 

Several recent algorithms attempt to use backbone information to bias the move operator
employed by local search. For example, Zhang (2004) describes an approach called
backbone-guided local search in which the frequency with which an attribute (e.g., an
assignment of a particular value to a particular variable in a Boolean formula) appears in
random local optima is used as a proxy for the frequency with which the attribute
appears in global optima. The approach improved the performance of the WalkSAT algorithm
(Selman, Kautz, & Cohen, 1994) on large instances from SATLIB (Hoos & Stützle, 2000).
A similar algorithm has been successfully applied to the TSP (Zhang & Looks, 2005) to
improve the performance of an iterated Lin-Kernighan algorithm (Martin, Otto, & Felten,
1991). Zhang writes:

This method is built upon the following working hypothesis: On a problem 
whose optimal and near optimal solutions form a cluster, if a local search al- 
gorithm can reach close vicinities of such solutions, the algorithm is effective 
in finding some information of the solution structures, backbone in particular. 
(Zhang, 2004, p. 3) 

Based on the results of §§4-5, this working hypothesis is satisfied for random JSPs with
$N/M \approx 1$ or lower. It seems plausible that backbone-guided local search could be used to boost
the performance of early local search heuristics for the JSP such as those of van Laarhoven
et al. (1992) and Taillard (1994) (whether the results would be competitive with those of
recent algorithms such as i-TSAB is a separate question).

The hypothesis is typically violated for random JSP instances with larger values of $N/M$.
In these cases it makes more sense to attempt to exploit local clustering of optimal and
near-optimal schedules.
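For the JSP, the natural "attribute" is the orientation of a disjunctive edge, that is, which of two jobs a machine processes first. The sketch below estimates orientation frequencies from a sample of local optima in the spirit of backbone-guided search. It is our own illustration, not Zhang's implementation; schedules are assumed to be given as one job sequence per machine, and the names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def edge_orientations(machine_sequences):
    """Yield (machine, first_job, second_job) for every ordered pair on each machine."""
    for machine, order in enumerate(machine_sequences):
        for first, second in combinations(order, 2):
            yield (machine, first, second)

def orientation_frequencies(local_optima):
    """Fraction of sampled schedules in which each edge orientation appears."""
    counts = Counter()
    for schedule in local_optima:
        counts.update(edge_orientations(schedule))
    return {edge: c / len(local_optima) for edge, c in counts.items()}

def pseudo_backbone(local_optima, threshold=1.0):
    """Orientations shared by at least a `threshold` fraction of the sample."""
    freqs = orientation_frequencies(local_optima)
    return {edge for edge, f in freqs.items() if f >= threshold}
```

A move operator could then prefer moves that preserve high-frequency orientations; how strongly to bias is a design choice that this sketch leaves open.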

9.3.2 i-TSAB 

Nowicki and Smutnicki (2005) present a JSP heuristic called i-TSAB which employs multiple
runs of the tabu search algorithm TSAB (Nowicki & Smutnicki, 1996). i-TSAB employs path
relinking to "localize the center of BV [big valley], probably close to the global minimum"
(Nowicki & Smutnicki, 2005). In other words, i-TSAB was designed based on the intuitive
picture depicted in Figure 6 (A), which is inaccurate for typical random JSP instances with
$N/M > 3$. Note that although random JSP instances become "easy" as $N/M \to \infty$, instances
with $N/M \approx 3$ are by no means easy, as evidenced by Figure 9 (B).

For concreteness, we briefly describe how i-TSAB works. Initially, i-TSAB performs a
number of independent runs of TSAB and adds each best-of-run schedule to a pool of "elite
solutions". It then performs additional runs of TSAB and uses the best-of-run schedules
from these additional runs to replace schedules in the pool of elite solutions. Starting points
for the additional TSAB runs are either (i) random elite solutions or (ii) schedules obtained
by performing path relinking on a random pair of elite solutions. Given two schedules $S_1$
and $S_2$, path relinking uses a move operator to generate a new schedule that is midway
(in terms of disjunctive graph distance) between $S_1$ and $S_2$. The pool of elite solutions
can be thought of as a cloud of particles that hovers over the search space and (hopefully)
converges to a region of the space containing a global optimum.
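The control flow just described can be summarized in a short skeleton. This is a hedged sketch rather than the algorithm of Nowicki and Smutnicki: the local search (standing in for TSAB), the relinking operator, and the cost function are passed in as callables, and two details are assumptions of ours, namely the 50/50 choice between the two kinds of starting point and the rule that a new schedule replaces the worst elite solution only if it is better.

```python
import random

def elite_pool_search(random_start, local_search, relink, cost,
                      pool_size, extra_runs, rng):
    """Elite-pool search: independent runs first, then restarts from the pool."""
    # Phase 1: independent runs fill the pool with best-of-run schedules.
    pool = [local_search(random_start(rng)) for _ in range(pool_size)]
    # Phase 2: restart from an elite solution or from a relinked pair.
    for _ in range(extra_runs):
        if pool_size < 2 or rng.random() < 0.5:
            start = rng.choice(pool)
        else:
            s1, s2 = rng.sample(pool, 2)
            start = relink(s1, s2)
        candidate = local_search(start)
        worst = max(range(len(pool)), key=lambda i: cost(pool[i]))
        if cost(candidate) < cost(pool[worst]):   # assumed replacement rule
            pool[worst] = candidate
    return min(pool, key=cost)
```

A "multiple clouds" variant would maintain several such pools and restrict relinking to pairs drawn from the same pool.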

For random JSP instances with $N/M \approx 1$, our results are consistent with the idea that
the cloud of elite solutions converges to the "center" of the big valley. For random JSP
instances with $N/M > 3$, however, the cloud must either converge to one of many big valleys
or not converge at all. As an alternate approach one can imagine using multiple clouds,
with the intention that each cloud specializes on a particular big valley. It seems plausible
that such ideas could improve the performance of i-TSAB on random JSP instances with
larger values of $N/M$.

Appendix A: Additional Proofs 

For the proofs in this section, we define $\tau(O) = \sum_{o \in O} \tau(o)$, where $O$ is any set of operations.
We make use of the following inequality (Spencer, 2005).

Azuma's Perimetric Inequality (A.P.I.). Let $X = (X_1, X_2, \ldots, X_n)$ be a vector of $n$
independent random variables. Let the function $f(x)$ take as input a vector $x = (x_1, x_2, \ldots, x_n)$,
where $x_i$ is a realization of $X_i$ for $i \in [n]$, and produce as output a real number. Suppose
that for some $\beta > 0$ it holds that for any two vectors $x$ and $x'$ that differ on at most one
component,

$$|f(x) - f(x')| \le \beta .$$

Then for any $\alpha > 0$,

$$P\left[f(X) > E[f(X)] + \alpha\sqrt{n}\right] < \exp\left(-\frac{\alpha^2}{2\beta^2}\right) .$$

The same inequality holds for $P\left[f(X) < E[f(X)] - \alpha\sqrt{n}\right]$.
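As a sanity check on how such a bound is used, the sketch below compares an empirical upper-tail frequency with the standard bounded-differences form $\exp(-\alpha^2 / (2\beta^2))$. The test function (a sum of $n$ independent Uniform(0,1) draws, for which changing one coordinate moves the sum by at most $\beta = 1$) and the function names are assumptions chosen for illustration.

```python
import math
import random

def empirical_upper_tail(n, alpha, trials, seed=0):
    """Estimate P[f(X) > E[f(X)] + alpha*sqrt(n)] for f = sum of n Uniform(0,1) draws."""
    rng = random.Random(seed)
    threshold = n / 2 + alpha * math.sqrt(n)   # E[f(X)] = n/2
    hits = sum(
        sum(rng.random() for _ in range(n)) > threshold
        for _ in range(trials)
    )
    return hits / trials

def bounded_difference_bound(alpha, beta):
    """Tail bound exp(-alpha^2 / (2 beta^2)) for a function with differences at most beta."""
    return math.exp(-alpha ** 2 / (2 * beta ** 2))
```

The empirical frequency is typically far below the bound; the inequality is useful because it applies to any function with bounded differences, such as the schedule quantities analyzed in the proofs below.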

Lemma 2. Let I be a random N by M JSP instance. Then for fixed M, it holds whp (as
$N \to \infty$) that the schedule $S = S(\pi_\infty, I)$ has the property that

$$S(o) = S^+(\mathcal{M}(o)) \quad \forall o \in ops(I) .$$

Proof. It remains only to prove Claim 2.1 from the proof in §4, which says that whp, for
all $o \in ops^{2+}(I)$ we have

$$S(o) - S(\mathcal{J}(o)) > \tfrac{1}{2} E\left[S(o) - S(\mathcal{J}(o))\right] .$$

Pick some arbitrary operation $o \in ops^{2+}(I)$, and suppose that the random choices used
to construct I were made in the following order:

1. Randomly choose $m_1 = m(o)$ and $m_2 = m(\mathcal{J}(o))$.

2. For $k$ from 1 to $N$:

(a) Randomly choose the order in which job $J^k$ uses the machines (if $o \in J^k$ then
part of this choice has already been made in step 1).

(b) Randomly choose $\tau(J^k_i) \ \forall i \in [M]$.

Let the random variable $X_k$ denote the sequence of random bits used in steps (a) and
(b) of the $k$th iteration of the loop. Define $\Delta_o = S(o) - S(\mathcal{J}(o))$. Then, for any fixed choices
of $m_1$ and $m_2$, $\Delta_o$ is a function of the independent events $X_1, X_2, \ldots, X_N$, and it is easy
to check that altering a particular $X_i$ changes the value of $\Delta_o$ by at most $2\tau_{max}$. Thus

$$P\left[\Delta_o < \tfrac{1}{2}E[\Delta_o]\right] = P\left[\Delta_o < E[\Delta_o] - \frac{\mu N}{2M}\right] \le P\left[\Delta_o < E[\Delta_o] - \frac{\mu\sqrt{N}}{2M}\sqrt{N}\right] \le \exp\left(-\frac{\mu^2 N}{2(4M\tau_{max})^2}\right)$$

where in the first step we have used the fact (from the proof in §4) that $E[\Delta_o] = \frac{\mu N}{M}$ and
in the last step we have used A.P.I. Taking a union bound over the $N(M-1)$ operations
in $ops^{2+}(I)$ proves the claim.

□ 

Lemma 3. Let I be a random N by M JSP instance, and let $S$ be a random schedule for
I. Let $\hat{S}$ be an optimal schedule for I such that $\|S - \hat{S}\|$ is minimal. Let $f(M)$ be any
unbounded, increasing function of M. Then for fixed N, it holds whp (as $M \to \infty$) that
$\|S - \hat{S}\| \le f(M)$.

Proof. Let $\bar{S} = S(\pi_0, I)$. The proof of Corollary 1 showed that for any $J$, $E[\Delta^{\bar{S}}_J]$ is $O(N^2)$.
Thus it holds whp that $\Delta^{\bar{S}}_J \le \log(f(M)) \ \forall J$. As in the proof of Theorem 5, the procedure
used to produce $S$ is a mixture of instance-independent priority rules, each subject to
Lemma 1. Thus for any $J$, $E[\Delta^{S}_J]$ is $O(N)$, so whp $\Delta^{S}_J \le \log(f(M)) \ \forall J$.

Let $O_{near}(J_i) = \{J'_j : J' \ne J, \ |\sum_{i' < i} \tau(J_{i'}) - \sum_{j' < j} \tau(J'_{j'})| \le \log(f(M))\}$. ($O_{near}(J_i)$ is
the set of operations that would be scheduled "near" in time to $J_i$ if we ignored the fact
that a machine may only perform one operation at a time.) Let $E_{near} = \{e = \{J_i, J'_j\} \in E(I) :
J'_j \in O_{near}(J_i)\}$. Under the assumptions of the previous paragraph (each of which
holds whp), $\|S - \hat{S}\| \le |E_{near}|$. For any $J_i$, $E[|O_{near}(J_i)|]$ is $O(N \log f(M))$. Thus

$$E[\|S - \hat{S}\|] \le E[|E_{near}|] = \sum_{o \in ops(I)} \frac{1}{M} E[|O_{near}(o)|] = O(N^2 \log(f(M)))$$

so $\|S - \hat{S}\| \le f(M)$ whp by Markov's inequality. □

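For the purpose of the remaining proofs, it is convenient to introduce some additional
notation. Let $T = (T_1, T_2, \ldots, T_{|T|})$ be a sequence of operations. We define

• $T_{(i_1,i_2]} = \{T_i : i_1 < i \le i_2\}$, and

• $T^{\bar{m}}_{(i_1,i_2]} = \{o \in T_{(i_1,i_2]} : m(o) = \bar{m}\}$.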






Lemma 4. Let I be a random N by M JSP instance, let $S$ be a random schedule for I,
and let $\hat{S}$ be an optimal schedule for I such that $\|S - \hat{S}\|$ is minimal. Then for fixed M and
$\epsilon > 0$, it holds whp (as $N \to \infty$) that $\|S - \hat{S}\| \le N^{1+\epsilon}$.

Proof. Let $T$ be the sequence of operations $o \in ops(I)$, sorted in ascending order by priority
$\pi_{rand}(I, o)$ (where $\pi_{rand}$ is the random priority rule used to create $S$). Note that for any
$o \in ops(I)$ with $\mathcal{J}(o) \ne o^{\emptyset}$, $\mathcal{J}(o)$ must appear before $o$ in $T$. Let $T_i$ denote the $i$th operation
in $T$.

Consider the schedule $\tilde{S}$ defined by the following procedure:

1. $\tilde{S}(o) \leftarrow \infty \ \forall o \in ops(I)$.

2. $Q \leftarrow ()$. Let $Q_j$ denote the $j$th operation in $Q$.

3. Let the function ready($o$) return true if $\tilde{S}^+(\mathcal{M}(o)) \ge \tilde{S}^+(\mathcal{J}(o))$, false otherwise.

4. For $i$ from 1 to $NM$ do:

(a) If ready($T_i$), then set $\tilde{S}(T_i) \leftarrow \tilde{S}^+(\mathcal{M}(T_i))$. Otherwise append $T_i$ onto $Q$.

(b) For $j$ from 1 to $|Q|$ do:

i. If ready($Q_j$), then set $\tilde{S}(Q_j) \leftarrow \tilde{S}^+(\mathcal{M}(Q_j))$ and remove $Q_j$ from $Q$.

5. Schedule any remaining operations of $Q$ in a manner to be specified (in the last
paragraph of the proof).
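The procedure above can be sketched in code as follows. This is an illustration under stated assumptions rather than part of the proof: an instance is a list of jobs, each a list of (machine, duration) pairs; the sequence T is given as (job, position) pairs in an order that respects job precedence; the machine-predecessor of an operation is the last operation placed on its machine so far; and step 5 is replaced by a simple greedy fallback of our own, since the proof specifies a different rule. The names are hypothetical.

```python
import math

def queued_schedule(instance, order):
    """instance[j] is a list of (machine, duration); order lists (job, position) pairs."""
    machine_free = [0] * len(instance[0])   # completion time of the last op on each machine
    start, finish, queue = {}, {}, []

    def machine_of(op):
        return instance[op[0]][op[1]][0]

    def pred_finish(op):
        job, pos = op
        if pos == 0:
            return 0                         # first operation of its job
        return finish.get((job, pos - 1), math.inf)   # infinity while unscheduled

    def ready(op):
        return machine_free[machine_of(op)] >= pred_finish(op)

    def place(op, at):
        duration = instance[op[0]][op[1]][1]
        start[op] = at
        finish[op] = machine_free[machine_of(op)] = at + duration

    for op in order:                         # step 4
        if ready(op):
            place(op, machine_free[machine_of(op)])
        else:
            queue.append(op)
        waiting = []
        for queued in queue:                 # step 4(b): a single pass over Q
            if ready(queued):
                place(queued, machine_free[machine_of(queued)])
            else:
                waiting.append(queued)
        queue = waiting

    placed_without_idle = set(start)
    for op in queue:                         # step 5: assumed greedy fallback
        place(op, max(machine_free[machine_of(op)], pred_finish(op)))
    return start, finish, placed_without_idle
```

Every operation placed during step 4 starts exactly when its machine becomes free, so any idle time in the result comes from operations still queued at the end, which is the property the optimality argument relies on.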

The construction of $\tilde{S}$ is just like the construction of $S$, except for the manipulations
involving $Q$. The purpose of $Q$ is to delay the scheduling of any operation $o$ that, if
scheduled immediately, might produce a schedule in which $\tilde{S}(o) > \tilde{S}^+(\mathcal{M}(o))$. We first
show that $\|S - \tilde{S}\| \le N^{1+\epsilon}$ whp; then we show that $\tilde{S}$ is optimal whp.

Let $Q^i$ denote $Q$ as it exists after $i$ iterations of step 4 have been performed. Let
$q(o) = \sum_{i=1}^{NM} |\{o\} \cap Q^i|$ be the number of iterations during which $o \in Q$. We claim that
$\|S - \tilde{S}\| \le \sum_{o \in ops(I)} q(o) + (N-1)|Q^{NM}|$. Letting $E^{\ne} = \{e \in E(I) : e(S) \ne e(\tilde{S})\}$, we have

$$\|S - \tilde{S}\| = |\{e \in E^{\ne} : e \cap Q^{NM} = \emptyset\}| + |\{e \in E^{\ne} : e \cap Q^{NM} \ne \emptyset\}|
\le |\{e \in E^{\ne} : e \cap Q^{NM} = \emptyset\}| + (N-1)|Q^{NM}|$$

so it suffices to show $|\{e \in E^{\ne} : e \cap Q^{NM} = \emptyset\}| \le \sum_{o \in ops(I)} q(o)$. To see this, let
$e = \{o_1, o_2\} \in E^{\ne}$ be such that $e \cap Q^{NM} = \emptyset$. We must have $q(o_1) + q(o_2) > 0$. We charge $e$
to the operation in $\{o_1, o_2\}$ that was inserted into $Q$ first. It is easy to see that an operation
can be charged for at most one edge per iteration it spends in $Q$, establishing our claim.
Thus it suffices to show that $\sum_{o \in ops(I)} q(o) + (N-1)|Q^{NM}| \le N^{1+\epsilon}$ whp.

We divide the construction of $\tilde{S}$ into $n = MN^{\frac{1}{2}-\epsilon'}$ epochs, each consisting of $N^{\frac{1}{2}+\epsilon'}$
iterations of step 4, for a to-be-specified $\epsilon' > 0$. Let $z_j$ denote the number of iterations of
step 4 that occur before the end of the $j$th epoch, with $z_j = 0$ for $j \le 0$ by convention. Let

• $C^{\bar{m}}_j = T^{\bar{m}}_{(0,z_j]} \setminus Q^{z_j}$ be the set of operations that have been scheduled to run on $\bar{m}$ by
the end of the $j$th epoch; and
• $O_{near} = \bigcup_{j \in [n]} \{o \in T_{(z_{j-1},z_j]} : \mathcal{J}(o) \in T_{(z_{j-(M+2)},z_j]}\}$ be the set of operations whose
job-predecessor belongs to a nearby epoch.

For any $i \in [NM]$, $P[T_i \in O_{near}] \le (M+2)N^{-\frac{1}{2}+\epsilon'}$. Thus for any $j \in [n]$,
$E[|O_{near} \cap T_{(z_{j-1},z_j]}|] \le (M+2)N^{2\epsilon'}$. Using A.P.I. it is straightforward to show that whp,

\Onear H T^,,_,,,,] \ < Vj G [u] . (9.1) 

We claim that whp, the following statements hold Vj G [n]: 



near ) 

(9.2) 

i<Zj 

/ ^ |JnQ^^-i nQ^^I < |JnQ^^-i| MJei, (9.3) 

n Q'^i-^' = , and (9.4) 
IQ^^ I < MN^ . (9.5) 

We prove this by induction, where each step of the induction fails with exponentially
small probability. For $j = 0$, (9.3) and (9.4) hold trivially. (9.2) is true because the
operations in $T_{(0,z_1]} \setminus O_{near}$ are the first operations in their jobs, hence cannot be added to
$Q$. (9.5) then follows from (9.2) and (9.1).

Consider the case j > 0. To show (9.2), let o be an arbitrary operation in T(^^._^^^.j\Onear- 

By the induction hypothesis (specifically, equation (9.4)), J'{o) G C"5^^°^^ Thus q{o) 



> 



O^T (C^_^^^"^^^ > T (Cj!.^^) . By the induction hypothesis. 

Letting A denote the right hand side of this inequality, we have E[A] = jjN^'^^' — 

MN^^ , and A.P.I, can be used to show that for some K > independent of N, P[A < 
0] < exp(-;^Af^'). Thus (9.2) holds with probability at least 1 - exp(-;^iV^'). 

To show (9.3), let J be such that Jr\Q^^-^ / 0, and let Jj G Q^^-^ be chosen so that i is 
minimal. Then J{Ji) G C]'}^^^'^\ Thus J, G Q.^. ^ r [cf}^^^'^^^ > r (cf^'^^ By (9.1), 

(9.2), and the induction hypothesis (equation (9.5)), \Q^^ \ < (M + 1)A^ ^ . Using the same 
technique as above, we can show that (9.3) holds with probability at least 1 — exp(— ^iV"^ ) 
for some K > independent of N. 

(9.3) implies (9.4). (9.2) and (9.4) together with (9.1) imply (9.5). Thus whp, (9.2)
through (9.5) hold $\forall j \in [n]$.

By (9.2) and (9.4), we have

$$E\Big[\sum_{o \in ops(I)} q(o)\Big] \le E[|O_{near}|] \, M N^{\frac{1}{2}+\epsilon'} \le M^2(M+2)N^{1+2\epsilon'}$$

and also

$$E[|Q^{NM}|] \le E[|T_{(z_{n-M},z_n]} \cap O_{near}|] \le M(M+2)N^{2\epsilon'}$$

so setting $\epsilon' = \frac{\epsilon}{3}$ gives $\|S - \tilde{S}\| \le \sum_{o \in ops(I)} q(o) + (N-1)|Q^{NM}| \le N^{1+\epsilon}$ whp.

It remains to show that $\tilde{S}$ is optimal whp. We first prove the following claim.

Claim 4.1. For any non-negative integers $a$ and $b$, the probability that $T_{(a,b]}$ contains
two operations from the same job is at most $\frac{(b-a)^2}{N}$.

Proof of Claim 4.1. Let $X$ denote the number of pairs of operations in $T_{(a,b]}$ that belong to
the same job. Then $P[X > 0] \le E[X] \le \binom{b-a}{2}\frac{1}{N} \le \frac{(b-a)^2}{N}$. □

To see that $\tilde{S}$ is optimal whp, note that the operations scheduled prior to step 5 do not
cause any idle time on any machine, so it is only the operations in $Q^{NM}$ that can cause $\tilde{S}$
to be sub-optimal. Let $\tau(\bar{m}) = \tau(\{o \in ops(I) : m(o) = \bar{m}\})$ denote the workload of machine
$\bar{m}$. Let $\hat{m} = \arg\max_{\bar{m} \in [M]} \tau(\bar{m})$. Then the following hold whp.

• The set Z"^ = T™ 1 consists of operations belonging to jobs that use rh 

{NM-2MN^,NM] 

last. (It holds whp that Z™ C Z. where Z = T 1 So if Z"' contains an 

^ {NM-N^ ,NM] 

operation from a job that docs not use m last, then Z must contain two operations 

from the same job. But by Claim 4.1, the probability that this happens is at most 

(ivi)2^ = o(i).) 

• nN* < t{Z"^) and t{Z"^) < r(m) — r(m) Vm / m. (This follows by applying the 
Central Limit Theorem to t{Z'"^), T{m), and r(m)). 

Thus whp it holds that prior to the execution of step 5, S contains a period of length 
at least t{Z'^) > fiN^ during which the only operations being processed are those in Z"^, 
where {o G ops(/) : J(o) G Z™} = 0. Assuming \Q^^\ < N^'' (holds whp), we can always 
schedule the operations in so as to guarantee £{S) = T{rh), which implies S is optimal. 

□ 

Theorem 6. Let I be a random N by M JSP instance with optimal makespan $\ell_{min}(I)$ and
let $S$ be a random schedule for I. Then for fixed M and $\epsilon > 0$, it holds whp (as $N \to \infty$)
that $\ell(S) \le (1+\epsilon)\ell_{min}(I)$.

Proof. As in the proof of Lemma 4, let $T$ be the sequence of operations $o \in ops(I)$, sorted
in ascending order by priority $\pi_{rand}(I, o)$ (where $\pi_{rand}$ is the random priority rule used to
create $S$). Note that for any $o \in ops(I)$ with $\mathcal{J}(o) \ne o^{\emptyset}$, $\mathcal{J}(o)$ must appear before $o$ in $T$.
Let $T_i$ denote the $i$th operation in $T$.

Rather than analyze $S$ directly, we analyze a schedule $\tilde{S}$ defined by the following
procedure:

1. $t \leftarrow 0$.

2. For $i$ from 1 to $NM$ do:

(a) Set $\tilde{S}(T_i) = \max(t, \tilde{S}^+(\mathcal{J}(T_i)), \tilde{S}^+(\mathcal{M}(T_i)))$.

(b) If $\tilde{S}^+(\mathcal{J}(T_i)) > \tilde{S}^+(\mathcal{M}(T_i))$, set $t = \max_{i' \le i} \tilde{S}^+(T_{i'})$.

The procedure is identical to the one used to construct $S$, except that, whenever an
operation $T_i$ is assigned a start time $\tilde{S}(T_i) > \tilde{S}^+(\mathcal{M}(T_i))$, the procedure inserts artificial
delays into the schedule in order to re-synchronize the machines. For any $T$, it is clear that
$\ell(S) \le \ell(\tilde{S})$. Thus, it suffices to show that $\ell(\tilde{S}) \le (1+\epsilon)\ell_{min}(I)$ whp.
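The relationship between the two constructions can be checked on small examples with the sketch below. It is an illustration only, with our own representation (jobs as lists of (machine, duration) pairs, T as a list of (job, position) pairs) and a hypothetical function name; with `resync=False` it builds the ordinary schedule for the given order, and with `resync=True` it applies steps 2(a) and 2(b) above.

```python
def makespan_in_order(instance, order, resync=False):
    """Schedule operations in the given order; optionally re-synchronize machines."""
    machine_free = [0] * len(instance[0])   # completion time of the last op on each machine
    finish = {}
    t = 0
    for job, pos in order:
        machine, duration = instance[job][pos]
        pred = finish[(job, pos - 1)] if pos > 0 else 0   # job-predecessor completion
        mach = machine_free[machine]                      # machine-predecessor completion
        begin = max(t, pred, mach) if resync else max(pred, mach)
        finish[(job, pos)] = machine_free[machine] = begin + duration
        if resync and pred > mach:
            t = max(finish.values())        # step 2(b): no later operation starts before t
    return max(finish.values())
```

On the two-job instance `[[(0, 5), (1, 1)], [(1, 1), (0, 1)]]` with order `[(0, 0), (1, 0), (0, 1), (1, 1)]`, the ordinary schedule has makespan 6 and the re-synchronized one has makespan 7, illustrating that the artificial delays can only lengthen the schedule.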

We divide the construction of $\tilde{S}$ into $n$ epochs, where each update to $t$ (in step 2(b))
defines the beginning of a new epoch. Let $z_i$ be the number of operations scheduled before the
end of the $i$th epoch, with $z_0 = 0$ by convention. Let $t_i = \max_{i' \le z_i} \tilde{S}^+(T_{i'})$ be the (updated)
value of t at the end of the i^^ epoch. Define Aj = ^^—i U — maxj/<j 
Then £(S) — imin{I) < Y17=i ^° suffices to show that Y17=i — ^^min{I) whp. 

Let / = [n], and let L = {i G I : — Zi-i > Nr}. We first consider X^jg^ Aj; then we 
consider J2iei\L ^i- 

2 

Let ii and 12 be arbitrary integers with < ii,i2 < NM and 12 — ii > Nr. Let 
r = t{T^^ ■^^). Then IE[f] = /i^^^, por any T, f is a function of the outcome of at most 
12 — ii events (namely, the definition of each of the jobs in {J : Jn T^ii^jj] 7^ 0})) each of 
which alters the value of r by at most Tmax- 

It follows by A.P.I. that 



P[|f - E[f]| > N''y/i2-ii] < 2exp 




for any e' > 0. Thus, it holds whp that |r — ]E[r]| < iV^\/^2 — h for all possible choices 
of ii and 12- In particular, whp we have Aj < 2MN^ ^/zi — Zj-i <\/i ^ L, which implies 

EieL Aj < Tvf EjsL 2MN''Vn^ = 2MnI+''. 

Now consider Yli(^i\L ^i- shown in the proof of Lemma 4 (Claim 4.1), for any 
non-negative integers a and b the probability that T(^a,b] contains two operations from the 

same job is at most ^^-7^- Thus the probability that an arbitrary subsequence of size 

contains two operations from the same job is at most , so E[|I \ L\] < . Clearly 

< wAtI G / \ L, so E[Eiei\L ^i] is 0{nI). 

Thus EEjg/ Aj] is 0{nI+^') for any e' > 0, so Y,iei Aj < A^^+^'' whp, while it is easy 
to see that lmin{I) > My whp. 

□